zdx-investigate-multi-app-outage

Diagnose a multi-application outage scoped to one location by correlating ZDX alerts, affected devices, and shared cloud-path hops. Identifies the devices affected at a specific office, compares the per-application network path across multiple SaaS apps to surface the common network bottleneck, and produces an evidence-backed recommendation. Use when an admin reports: 'A ZDX alert shows users in the Columbus office cannot reach Salesforce and ServiceNow', 'Identify affected devices and the common network path issue', 'Multiple users at Dallas are having issues with several SaaS apps', 'What do these failing apps have in common at this office?', or 'Find the shared network bottleneck for the New York users hitting Workday and Box.'


# ZDX: Investigate Multi-Application Outage at a Location

## Keywords

multi-app outage, location outage, office outage, common network path, shared bottleneck, cross-app correlation, hop analysis, ISP issue, cloud path, multiple applications affected, regional outage, site-wide issue, salesforce slow, servicenow slow, cross-tenant impact, shared infrastructure

## Overview

Diagnose a scenario where users at one specific office or location are unable to reach **multiple SaaS applications simultaneously**. The skill correlates active ZDX alerts, scopes impact by location, intersects the affected-device lists across the impacted apps, and then compares the per-application network path (cloud path) on a common affected device to identify the **shared upstream segment** that is the actual root cause.
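The correlation described above can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the device IDs, the hop addresses other than `203.0.113.4`, the function names, and the 1% loss threshold are all hypothetical, and real inputs would come from the ZDX tool responses.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical sample data: affected device IDs per application.
affected_by_app: Dict[str, Set[str]] = {
    "Salesforce": {"dev-1", "dev-2", "dev-3"},
    "ServiceNow": {"dev-2", "dev-3", "dev-4"},
}

# Hypothetical cloud path per application for one common device,
# as (hop address, packet loss percent) in path order.
paths_by_app: Dict[str, List[Tuple[str, float]]] = {
    "Salesforce": [("10.0.0.1", 0.0), ("198.51.100.1", 0.1),
                   ("203.0.113.4", 4.2), ("192.0.2.10", 0.0)],
    "ServiceNow": [("10.0.0.1", 0.0), ("198.51.100.1", 0.0),
                   ("203.0.113.4", 4.8), ("192.0.2.77", 0.2)],
}


def common_devices(affected: Dict[str, Set[str]]) -> Set[str]:
    """Devices that appear in the affected list of every application."""
    sets = list(affected.values())
    return set.intersection(*sets) if sets else set()


def shared_lossy_hops(paths: Dict[str, List[Tuple[str, float]]],
                      loss_threshold: float = 1.0) -> List[str]:
    """Hop addresses that show loss >= threshold on every application's path.

    The threshold is an assumed cut-off for illustration only.
    Results follow the hop order of the first application's path.
    """
    apps = list(paths)
    if not apps:
        return []
    lossy = [{addr for addr, loss in paths[app] if loss >= loss_threshold}
             for app in apps]
    shared = set.intersection(*lossy)
    return [addr for addr, _ in paths[apps[0]] if addr in shared]


print(sorted(common_devices(affected_by_app)))
print(shared_lossy_hops(paths_by_app))
```

A hop that is lossy on only one application's path drops out of the intersection, which is what separates a shared upstream segment from an app-specific problem.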

**Use this skill when:** An admin receives a ZDX alert (or user complaint) that names a single office and two or more applications — e.g., "Columbus users can't reach Salesforce and ServiceNow" — and the desired output is a list of affected devices plus an evidence-backed identification of the shared network path issue.

**ZDX Copilot alignment:** Troubleshooting + Optimization — converts a coarse "users at office X can't reach apps Y/Z" complaint into a hop-level root cause and a prioritized remediation list.

---

## ⚠ HTML OUTPUT — READ THIS BEFORE PRODUCING ANY HTML

**You MUST NOT hand-write or copy/paste an HTML page for this skill.**
There is exactly one acceptable way to produce the HTML output:

1. **Read the template from disk**; do NOT inline a copy in your response. The template lives next to this SKILL.md inside the skill's package, at:

   ```text
   ./templates/report.html.template
   ```

   The `./` prefix is intentional: this path is **relative to the skill folder** (the directory containing this SKILL.md), **never** an absolute path. Most agents that load skills from an uploaded `.zip` extract the package into a working directory and expose its contents via that relative path — read the file by joining the skill's own root directory with `./templates/report.html.template`. Do not rewrite this to an absolute path that points at the author's machine.

2. **Build a single JSON object** (`__ZDX_DATA__` payload) shaped exactly as documented in the *Data Payload Contract* section below. Aggregate the responses from the ZDX MCP tool calls (Steps 1–7 of the *Workflow*) into that object.

3. **Replace** the literal token `__ZDX_DATA__` (which appears once, inside `<script type="application/json" id="zdx-data">__ZDX_DATA__</script>`) with the JSON object. Do not edit any other part of the template.

4. **Write** the result to disk as `multi_app_outage_<location>_<YYYYMMDD-HHMMSS>.html` next to the .docx, and give the user a `computer://` link to it.
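The four steps above could be sketched as follows. This is an illustration under assumptions, not the skill's own tooling: the function names are hypothetical, the lowercase underscore slug for `<location>` is a guess at a reasonable format, and escaping `</` inside the JSON is a defensive choice so that a string value cannot close the `<script>` element early.

```python
import json
import re
from datetime import datetime
from pathlib import Path

TOKEN = "__ZDX_DATA__"


def render_report(template_text: str, payload: dict) -> str:
    """Replace the single __ZDX_DATA__ token with the JSON payload."""
    if template_text.count(TOKEN) != 1:
        raise ValueError("expected exactly one __ZDX_DATA__ token in the template")
    data = json.dumps(payload, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so the parsed value is unchanged.
    data = data.replace("</", "<\\/")
    return template_text.replace(TOKEN, data)


def report_filename(location: str, now: datetime) -> str:
    """Build multi_app_outage_<location>_<YYYYMMDD-HHMMSS>.html.

    Assumption: the location is slugged to lowercase letters, digits, and underscores.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "_", location).strip("_").lower()
    return f"multi_app_outage_{slug}_{now.strftime('%Y%m%d-%H%M%S')}.html"


def write_report(skill_root: Path, out_dir: Path, location: str, payload: dict) -> Path:
    """Read the template relative to the skill folder and write the report."""
    template = (skill_root / "templates" / "report.html.template").read_text(encoding="utf-8")
    out_path = out_dir / report_filename(location, datetime.now())
    out_path.write_text(render_report(template, payload), encoding="utf-8")
    return out_path
```

Here `skill_root` stands for the directory that contains this SKILL.md, so the template path stays relative to the skill folder as required above.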

This template already provides: Zscaler header with logo · sticky top bar · ACTIVE INCIDENT pill · scope summary bar · color-coded incident banner · 6 KPI cards with severity-coded top borders and subtitles · per-table search + filter chips · sortable color-coded tables · per-table CSV export · light/dark theme toggle · top-right language dropdown (EN / ES / PT / FR / JA) · printable PDF view · localStorage prefs · Analysis / Root Cause / Remediation block.

**If you find yourself writing `<html>`, `<style>`, or `<table>` in a code-block destined for the user, stop. Read the template instead.**

A populated reference rendering ships with this skill at `./example/report.example.html` (relative to the skill folder). Open it in a browser to preview the exact layout and depth expected.

### Data Payload Contract

The full `__ZDX_DATA__` payload is one JSON object. Every field below is **required** unless marked optional.

```json
{
  "generated_at": "2026-05-18T12:00:00Z",

  "scope": {
    "office":    "Columbus Office (ID: 73260557)",
    "apps":      "Salesforce Lightning (ID 18) & ServiceNow (ID 5)",
    "window":    "Last 24h",
    "generated": "2026-05-18T12:00:00Z"
  },

  "incident": {
    "severity": "critical",
    "title":    "Critical — Multi-App Availability Loss at Columbus Office",
    "body":     "2–4 sentences of plain-language explanation: which apps, which alerts (by ID), what the score and PFT/DNS numbers actually mean, what the recent-3-datapoint pattern shows. Don't be terse — the admin reads this first."
  },

  "kpis": {
    "office":          { "value": "Columbus",          "sub": "Location ID 73260557", "severity": "info" },
    "appsAffected":    { "value": 2,                   "sub": "Salesforce Lightning · ServiceNow", "severity": "critical" },
    "affectedDevices": { "value": 5,                   "sub": "Across both alerts (combined)", "severity": "critical" },
    "avgScore":        { "value": "14 / 100",          "sub": "Critical threshold < 34", "severity": "critical" },
    "sharedHop":       { "value": "Hop 3 — Local ISP", "sub": "203.0.113.4 — 4.2–4.8% loss", "severity": "critical" },
    "deepTrace":       { "value": "⚠ No Data",         "sub": "New traces required", "severity": "warning" }
  },

  "columnOverrides": {
    "hops": {
      "app1Latency": "<App1> Latency",
      "app1Loss":    "<App1> Loss",
      "app2Latency": "<App2> Latency",
      "app2Loss":    "<App2> Loss"
    }
  },

  "tables": {
    "devices": [
      {
        "severity":    "Critical",
        "scope":       "ServiceNow Only",
        "device":      "PC-Aniru-71",
        "user":        "Anirudh Singh\n<email>",
        "os":          "Windows 11 Pro",
        "deviceId":    "86937887",
        "appsFailing": "ServiceNow"
      }
    ],
    "appMetrics": [
      {
        "severity":      "critical",
        "application":   "Salesforce Lightning",
        "appId":         18,
        "zdxScoreAvg":   15.5,
        "zdxScoreMin":   0,
        "pftAvg":        "7,541 ms",
        "pftMax":        "11,855 ms",
        "dnsAvg":        "7,022 ms",
        "probeFailures": "3 / 24 probes",
        "status":        "Critical"
      }
    ],
    "hops": [
      {
        "severity":    "critical",
        "verdict":     "SHARED",
        "hop":         3,
        "address":     "203.0.113.4 (Local ISP)",
        "app1Latency": "248ms", "app1Loss": "4.2%",
        "app2Latency": "255ms", "app2Loss": "4.8%"
      }
    ]
  },

  "analysis": {
    "summary":     "3–5 sentence narrative explaining what the numbers say across the two apps, why the matching metric profile is the strongest correlation signal, and what was ruled out by comparison.",
    "rootCause":   "1–3 sentence statement of the shared bottleneck with the supporting evidence — name the exact hop or segment and quote the latency/loss numbers that pin it.",
    "remediation": [
      "Immediate (within 1 hour): … (concrete action with target)",
      "Capture evidence (within 2 hours): start two deep-trace sessions on device <id> …",
      "Investigate (within 4 hours): … (specific systems to check)",
      "Monitor: schedule a recurring 15-minute deep trace every 4 hours against one affected device per app …",
      "Communicate: notify the affected user list that the issue is identified as ISP-side and Zscaler / apps are not at fault."
    ]
  }
}
```
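Before substituting the payload into the template, it may help to check it against the contract. The sketch below is a hypothetical helper, not part of the skill; it covers only the rules visible in this document for `generated_at`, `scope`, `incident`, and `kpis`, and makes no assumption about the remaining fields.

```python
from datetime import datetime, timezone
from typing import List, Optional

KPI_SEVERITIES = {"critical", "warning", "good", "info", "neutral"}
INCIDENT_SEVERITIES = {"critical", "warning"}
SCOPE_KEYS = {"office", "apps", "window", "generated"}
KPI_KEYS = {"value", "sub", "severity"}


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """ISO-8601 UTC timestamp in the same form as the sample payload."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def validate_payload(payload: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []

    if not isinstance(payload.get("generated_at"), str):
        problems.append("generated_at must be an ISO-8601 UTC string")

    scope = payload.get("scope")
    if not isinstance(scope, dict):
        problems.append("scope must be an object")
    else:
        missing = SCOPE_KEYS - scope.keys()
        if missing:
            problems.append(f"scope is missing: {sorted(missing)}")

    incident = payload.get("incident")
    if not isinstance(incident, dict):
        problems.append("incident must be an object")
    elif incident.get("severity") not in INCIDENT_SEVERITIES:
        problems.append("incident.severity must be 'critical' or 'warning'")

    kpis = payload.get("kpis")
    if not isinstance(kpis, dict):
        problems.append("kpis must be an object")
    else:
        for name, kpi in kpis.items():
            # Each KPI must be an object, never a bare primitive.
            if not isinstance(kpi, dict) or not KPI_KEYS <= kpi.keys():
                problems.append(f"kpis.{name} must be an object with value, sub, severity")
            elif kpi["severity"] not in KPI_SEVERITIES:
                problems.append(f"kpis.{name}.severity is not an allowed value")

    return problems
```

Returning a list of problems instead of raising on the first one lets the caller fix every issue in a single pass before rendering.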

#### Field rules

- **`generated_at`** — ISO-8601 UTC timestamp.
- **`scope`** — REQUIRED object with `office`, `apps`, `window`, `generated`. The keys map to translated labels (`Office`, `Apps`, `Window`, `Generated`) automatically. Include the location ID and app IDs inline in the values, as shown.
- **`incident`** — REQUIRED object. `severity` is `"critical"` or `"warning"`. The `body` is plain text (no HTML); be specific — quote alert IDs, score ranges, exact metric values, and what the most recent probe attempts returned.
- **`kpis`** — REQUIRED. Every KPI value must be `{value, sub, severity}` (not just a primitive). `severity` controls the top-border color and the number color: `"critical" | "warning" | "good" | "info" | "neutral"`.
- **`columnOve

Files: 5

Size: 891.9 KB

Complexity: 45/100

Category: Cloud & DevOps

Source: https://github.com/zscaler/zscaler-mcp-server/tree/main/skills/zdx/investigate-multi-app-outage
