zdx-investigate-multi-app-outage

Diagnose a multi-application outage scoped to one location by correlating ZDX alerts, affected devices, and shared cloud-path hops. Identifies the devices affected at a specific office, compares the per-application network path across multiple SaaS apps to surface the common network bottleneck, and produces an evidence-backed recommendation. Use when an admin reports: 'A ZDX alert shows users in the Columbus office cannot reach Salesforce and ServiceNow', 'Identify affected devices and the common network path issue', 'Multiple users at Dallas are having issues with several SaaS apps', 'What do these failing apps have in common at this office?', or 'Find the shared network bottleneck for the New York users hitting Workday and Box.'

# ZDX: Investigate Multi-Application Outage at a Location

## Keywords

multi-app outage, location outage, office outage, common network path, shared bottleneck, cross-app correlation, hop analysis, ISP issue, cloud path, multiple applications affected, regional outage, site-wide issue, salesforce slow, servicenow slow, cross-tenant impact, shared infrastructure

## Overview

Diagnose a scenario where users at one specific office or location are unable to reach **multiple SaaS applications simultaneously**. The skill correlates active ZDX alerts, scopes impact by location, intersects the affected-device lists across the impacted apps, and then compares the per-application network path (cloud path) on a common affected device to identify the **shared upstream segment** that is the actual root cause.

**Use this skill when:** An admin receives a ZDX alert (or user complaint) that names a single office and two or more applications — e.g., "Columbus users can't reach Salesforce and ServiceNow" — and the desired output is a list of affected devices plus an evidence-backed identification of the shared network path issue.

**ZDX Copilot alignment:** Troubleshooting + Optimization — converts a coarse "users at office X can't reach apps Y/Z" complaint into a hop-level root cause and a prioritized remediation list.
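
The correlation described in this overview can be sketched in a few lines of Python. This is a minimal illustration, not the skill's implementation: the `Hop` shape, the function names, and the 1% loss threshold are all assumptions made for the example, and none of them reflect the actual ZDX API schema. It intersects the per-app affected-device sets, then looks for the earliest hop address that appears on every app's path and shows loss on all of them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One hop of a cloud path (hypothetical shape, not the ZDX API schema)."""
    index: int
    address: str
    latency_ms: float
    loss_pct: float


def common_devices(affected_by_app: dict[str, set[str]]) -> set[str]:
    """Return device IDs that appear in every app's affected-device list."""
    device_sets = list(affected_by_app.values())
    if not device_sets:
        return set()
    return set.intersection(*device_sets)


def shared_hops(paths_by_app: dict[str, list[Hop]]) -> dict[str, dict[str, Hop]]:
    """Map each hop address present on every app's path to its per-app hop data."""
    if not paths_by_app:
        return {}
    by_app = {app: {h.address: h for h in hops} for app, hops in paths_by_app.items()}
    shared = set.intersection(*(set(hops) for hops in by_app.values()))
    return {addr: {app: by_app[app][addr] for app in by_app} for addr in shared}


def first_degraded_shared_hop(
    paths_by_app: dict[str, list[Hop]],
    loss_threshold_pct: float = 1.0,  # assumed threshold, tune per environment
) -> str | None:
    """Return the earliest shared hop address with loss on every app's path."""
    degraded = [
        per_app
        for per_app in shared_hops(paths_by_app).values()
        if all(h.loss_pct >= loss_threshold_pct for h in per_app.values())
    ]
    if not degraded:
        return None
    # "Earliest" means the lowest hop index observed for that address.
    earliest = min(degraded, key=lambda per_app: min(h.index for h in per_app.values()))
    return next(iter(earliest.values())).address
```

Selecting the earliest degraded hop, rather than the one with the highest loss, reflects the fact that loss introduced at one hop usually shows up on every hop after it, so the first hop where loss appears on all paths is the better root-cause candidate.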

---

## ⚠ HTML OUTPUT — READ THIS BEFORE PRODUCING ANY HTML

**You MUST NOT hand-write or copy/paste an HTML page for this skill.**
There is exactly one acceptable way to produce the HTML output:

1. **Read the template from disk.** Do NOT inline a copy in your response. The template lives next to this SKILL.md inside the skill's package, at:

   ```text
   ./templates/report.html.template
   ```

   The `./` prefix is intentional: this path is **relative to the skill folder** (the directory containing this SKILL.md), **never** an absolute path. Most agents that load skills from an uploaded `.zip` extract the package into a working directory and expose its contents via that relative path — read the file by joining the skill's own root directory with `./templates/report.html.template`. Do not rewrite this to an absolute path that points at the author's machine.

2. **Build a single JSON object** (`__ZDX_DATA__` payload) shaped exactly as documented in the *Data Payload Contract* section below. Aggregate the responses from the ZDX MCP tool calls (Steps 1–7 of the *Workflow*) into that object.

3. **Replace** the literal token `__ZDX_DATA__` (which appears once, inside `<script type="application/json" id="zdx-data">__ZDX_DATA__</script>`) with the JSON object. Do not edit any other part of the template.

4. **Write** the result to disk as `multi_app_outage_<location>_<YYYYMMDD-HHMMSS>.html` next to the .docx, and give the user a `computer://` link to it.
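
The four steps above can be sketched as follows. This is a hedged illustration under stated assumptions, not part of the skill package: the function names, the `skill_root` and `out_dir` parameters, the lowercase slug rule for `<location>`, and the use of UTC for the timestamp are all choices made for the example. The `</` escaping is likewise an added safeguard, so that a string inside the payload cannot close the `<script>` element early.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

TOKEN = "__ZDX_DATA__"


def render_report(template_text: str, payload: dict) -> str:
    """Substitute the single __ZDX_DATA__ token with the JSON payload."""
    count = template_text.count(TOKEN)
    if count != 1:
        raise ValueError(f"expected exactly one {TOKEN} token, found {count}")
    data = json.dumps(payload, ensure_ascii=False)
    # Assumption: escape "</" so payload text cannot terminate the <script> tag.
    # "\/" is a valid JSON escape, so the parsed value is unchanged.
    data = data.replace("</", "<\\/")
    return template_text.replace(TOKEN, data)


def report_filename(location: str, when: datetime) -> str:
    """Build multi_app_outage_<location>_<YYYYMMDD-HHMMSS>.html (slug rule assumed)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", location).strip("_").lower()
    return f"multi_app_outage_{slug}_{when:%Y%m%d-%H%M%S}.html"


def write_report(skill_root: Path, out_dir: Path, location: str, payload: dict) -> Path:
    """Read the template relative to the skill folder and write the report."""
    template_path = skill_root / "templates" / "report.html.template"
    html = render_report(template_path.read_text(encoding="utf-8"), payload)
    out_path = out_dir / report_filename(location, datetime.now(timezone.utc))
    out_path.write_text(html, encoding="utf-8")
    return out_path
```

Failing loudly when the token count is not exactly one catches both a wrong template file and an accidental double substitution.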

This template already provides: Zscaler header with logo · sticky top bar · ACTIVE INCIDENT pill · scope summary bar · color-coded incident banner · 6 KPI cards with severity-coded top borders and subtitles · per-table search + filter chips · sortable color-coded tables · per-table CSV export · light/dark theme toggle · top-right language dropdown (EN / ES / PT / FR / JA) · printable PDF view · localStorage prefs · Analysis / Root Cause / Remediation block.

**If you find yourself writing `<html>`, `<style>`, or `<table>` in a code block destined for the user, stop. Read the template instead.**

A populated reference rendering ships with this skill at `./example/report.example.html` (relative to the skill folder). Open it in a browser to preview the exact layout and depth expected.

### Data Payload Contract

The full `__ZDX_DATA__` payload is one JSON object. Every field below is **required** unless marked optional.

```json
{
  "generated_at": "2026-05-18T12:00:00Z",

  "scope": {
    "office":    "Columbus Office (ID: 73260557)",
    "apps":      "Salesforce Lightning (ID 18) & ServiceNow (ID 5)",
    "window":    "Last 24h",
    "generated": "2026-05-18T12:00:00Z"
  },

  "incident": {
    "severity": "critical",
    "title":    "Critical — Multi-App Availability Loss at Columbus Office",
    "body":     "2–4 sentences of plain-language explanation: which apps, which alerts (by ID), what the score and PFT/DNS numbers actually mean, what the recent-3-datapoint pattern shows. Don't be terse — the admin reads this first."
  },

  "kpis": {
    "office":          { "value": "Columbus",          "sub": "Location ID 73260557", "severity": "info" },
    "appsAffected":    { "value": 2,                   "sub": "Salesforce Lightning · ServiceNow", "severity": "critical" },
    "affectedDevices": { "value": 5,                   "sub": "Across both alerts (combined)", "severity": "critical" },
    "avgScore":        { "value": "14 / 100",          "sub": "Critical threshold < 34", "severity": "critical" },
    "sharedHop":       { "value": "Hop 3 — Local ISP", "sub": "203.0.113.4 — 4.2–4.8% loss", "severity": "critical" },
    "deepTrace":       { "value": "⚠ No Data",         "sub": "New traces required", "severity": "warning" }
  },

  "columnOverrides": {
    "hops": {
      "app1Latency": "<App1> Latency",
      "app1Loss":    "<App1> Loss",
      "app2Latency": "<App2> Latency",
      "app2Loss":    "<App2> Loss"
    }
  },

  "tables": {
    "devices": [
      {
        "severity":    "Critical",
        "scope":       "ServiceNow Only",
        "device":      "PC-Aniru-71",
        "user":        "Anirudh Singh\n[email protected]",
        "os":          "Windows 11 Pro",
        "deviceId":    "86937887",
        "appsFailing": "ServiceNow"
      }
    ],
    "appMetrics": [
      {
        "severity":      "critical",
        "application":   "Salesforce Lightning",
        "appId":         18,
        "zdxScoreAvg":   15.5,
        "zdxScoreMin":   0,
        "pftAvg":        "7,541 ms",
        "pftMax":        "11,855 ms",
        "dnsAvg":        "7,022 ms",
        "probeFailures": "3 / 24 probes",
        "status":        "Critical"
      }
    ],
    "hops": [
      {
        "severity":    "critical",
        "verdict":     "SHARED",
        "hop":         3,
        "address":     "203.0.113.4 (Local ISP)",
        "app1Latency": "248ms", "app1Loss": "4.2%",
        "app2Latency": "255ms", "app2Loss": "4.8%"
      }
    ]
  },

  "analysis": {
    "summary":     "3–5 sentence narrative explaining what the numbers say across the two apps, why the matching metric profile is the strongest correlation signal, and what was ruled out by comparison.",
    "rootCause":   "1–3 sentence statement of the shared bottleneck with the supporting evidence — name the exact hop or segment and quote the latency/loss numbers that pin it.",
    "remediation": [
      "Immediate (within 1 hour): … (concrete action with target)",
      "Capture evidence (within 2 hours): start two deep-trace sessions on device <id> …",
      "Investigate (within 4 hours): … (specific systems to check)",
      "Monitor: schedule a recurring 15-minute deep trace every 4 hours against one affected device per app …",
      "Communicate: notify the affected user list that the issue is identified as ISP-side and Zscaler / apps are not at fault."
    ]
  }
}
```
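
Before substituting the payload into the template, it can help to check it against the rules listed under *Field rules*. The sketch below is a hypothetical helper, not something shipped with the skill, and it covers only the rules for `generated_at`, `scope`, `incident`, and `kpis`. The timestamp check only confirms that the value parses as ISO-8601 via `datetime.fromisoformat`; it does not enforce UTC.

```python
from datetime import datetime

SCOPE_KEYS = ("office", "apps", "window", "generated")
INCIDENT_SEVERITIES = {"critical", "warning"}
KPI_SEVERITIES = {"critical", "warning", "good", "info", "neutral"}
KPI_KEYS = {"value", "sub", "severity"}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors: list[str] = []

    # generated_at: must parse as an ISO-8601 timestamp ("Z" suffix normalized).
    try:
        datetime.fromisoformat(str(payload.get("generated_at")).replace("Z", "+00:00"))
    except ValueError:
        errors.append("generated_at: not an ISO-8601 timestamp")

    # scope: object with office, apps, window, generated.
    scope = payload.get("scope")
    if not isinstance(scope, dict):
        errors.append("scope: must be an object")
    else:
        errors.extend(f"scope.{key}: missing" for key in SCOPE_KEYS if key not in scope)

    # incident: severity is "critical" or "warning".
    incident = payload.get("incident")
    if not isinstance(incident, dict):
        errors.append("incident: must be an object")
    elif incident.get("severity") not in INCIDENT_SEVERITIES:
        errors.append("incident.severity: must be 'critical' or 'warning'")

    # kpis: every entry is {value, sub, severity}, never a bare primitive.
    kpis = payload.get("kpis")
    if not isinstance(kpis, dict):
        errors.append("kpis: must be an object")
    else:
        for name, kpi in kpis.items():
            if not isinstance(kpi, dict) or not KPI_KEYS.issubset(kpi):
                errors.append(f"kpis.{name}: must be an object with value, sub, severity")
            elif kpi["severity"] not in KPI_SEVERITIES:
                errors.append(f"kpis.{name}.severity: unknown value {kpi['severity']!r}")

    return errors
```

Returning a list of problems instead of raising on the first one lets all contract violations be fixed in a single pass.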

#### Field rules

- **`generated_at`** — ISO-8601 UTC timestamp.
- **`scope`** — REQUIRED object with `office`, `apps`, `window`, `generated`. The keys map to translated labels (`Office`, `Apps`, `Window`, `Generated`) automatically. Include the location ID and app IDs inline in the values, as shown.
- **`incident`** — REQUIRED object. `severity` is `"critical"` or `"warning"`. The `body` is plain text (no HTML); be specific — quote alert IDs, score ranges, exact metric values, and what the most recent probe attempts returned.
- **`kpis`** — REQUIRED. Every KPI value must be `{value, sub, severity}` (not just a primitive). `severity` controls the top-border color and the number color: `"critical" | "warning" | "good" | "info" | "neutral"`.
- **`columnOve
