monitor-ci
# Monitor CI Command
What this skill does
# Monitor CI Command
You are the orchestrator for monitoring Nx Cloud CI pipeline executions and handling self-healing fixes. You spawn subagents to interact with Nx Cloud, run deterministic decision scripts, and take action based on the results.
## Context
- **Current Branch:** !`git branch --show-current`
- **Current Commit:** !`git rev-parse --short HEAD`
- **Remote Status:** !`git status -sb | head -1`
## User Instructions
$ARGUMENTS
**Important:** If user provides specific instructions, respect them over default behaviors described below.
## Configuration Defaults
| Setting | Default | Description |
| ------------------------- | ------------- | ------------------------------------------------------------------------- |
| `--max-cycles` | 10 | Maximum **agent-initiated** CI Attempt cycles before timeout |
| `--timeout` | 120 | Maximum duration in minutes |
| `--verbosity` | medium | Output level: minimal, medium, verbose |
| `--branch` | (auto-detect) | Branch to monitor |
| `--fresh` | false | Ignore previous context, start fresh |
| `--auto-fix-workflow` | false | Attempt common fixes for pre-CI-Attempt failures (e.g., lockfile updates) |
| `--new-cipe-timeout` | 10 | Minutes to wait for new CI Attempt after action |
| `--local-verify-attempts` | 3 | Max local verification + enhance cycles before pushing to CI |
Parse any overrides from `$ARGUMENTS` and merge with defaults.
## Nx Cloud Connection Check
Before starting the monitoring loop, verify the workspace is connected to Nx Cloud. Without this connection, no CI data is available and the entire skill is inoperable.
### Step 0: Verify Nx Cloud Connection
1. **Check `nx.json`** at workspace root for `nxCloudId` or `nxCloudAccessToken`
2. **If `nx.json` missing OR neither property exists** → exit with:
```
Nx Cloud not connected. Unlock 70% faster CI and auto-fix broken PRs with https://nx.dev/nx-cloud
```
3. **If connected** → continue to main loop
## Architecture Overview
1. **This skill (orchestrator)**: spawns subagents, runs scripts, prints status, does local coding work
2. **ci-monitor-subagent (haiku)**: calls one MCP tool (ci_information or update_self_healing_fix), returns structured result, exits
3. **ci-poll-decide.mjs (deterministic script)**: takes ci_information result + state, returns action + status message
4. **ci-state-update.mjs (deterministic script)**: manages budget gates, post-action state transitions, and cycle classification
## Status Reporting
The decision script handles message formatting based on verbosity. When printing messages to the user:
- Prepend `[monitor-ci]` to every message from the script's `message` field
- For your own action messages (e.g. "Applying fix via MCP..."), also prepend `[monitor-ci]`
## Anti-Patterns
These behaviors cause real problems — racing with self-healing, losing CI progress, or wasting context:
| Anti-Pattern | Why It's Bad |
| ----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| Using CI provider CLIs with `--watch` flags (e.g., `gh pr checks --watch`, `glab ci status -w`) | Bypasses Nx Cloud self-healing entirely |
| Writing custom CI polling scripts | Unreliable, pollutes context, no self-healing |
| Cancelling CI workflows/pipelines | Destructive, loses CI progress |
| Running CI checks on main agent | Wastes main agent context tokens |
| Independently analyzing/fixing CI failures while polling | Races with self-healing, causes duplicate fixes and confused state |
**If this skill fails to activate**, the fallback is:
1. Use CI provider CLI for a one-time, read-only status check (single call, no watch/polling flags)
2. Immediately delegate to this skill with gathered context
3. Do not continue polling on main agent — it wastes context tokens and bypasses self-healing
## Session Context Behavior
If the user previously ran `/monitor-ci` in this session, you may have prior state (poll counts, last CI Attempt URL, etc.). Resume from that state unless `--fresh` is set, in which case discard it and start from Step 1.
## MCP Tool Reference
{%- if platform == "claude" %}
The `ci_information` and `update_self_healing_fix` tools are called via the **ci-monitor-subagent**, not directly from the orchestrator. Calling MCP tools directly wastes main agent context with large response payloads. The field sets below are for composing subagent prompts (see Step 2a).
{%- endif %}
Three field sets control polling efficiency — use the lightest set that gives you what you need:
```yaml
WAIT_FIELDS: 'cipeUrl,commitSha,cipeStatus'
LIGHT_FIELDS: 'cipeStatus,cipeUrl,branch,commitSha,selfHealingStatus,verificationStatus,userAction,failedTaskIds,verifiedTaskIds,selfHealingEnabled,failureClassification,couldAutoApplyTasks,autoApplySkipped,autoApplySkipReason,shortLink,confidence,confidenceReasoning,hints,selfHealingSkippedReason,selfHealingSkipMessage'
HEAVY_FIELDS: 'taskOutputSummary,suggestedFix,suggestedFixReasoning,suggestedFixDescription'
```
The `ci_information` tool accepts `branch` (optional, defaults to current git branch), `select` (comma-separated field names), and `pageToken` (0-based pagination for long strings).
The `update_self_healing_fix` tool accepts a `shortLink` and an action: `APPLY`, `REJECT`, or `RERUN_ENVIRONMENT_STATE`.
## Default Behaviors by Status
The decision script returns one of the following statuses. This table defines the **default behavior** for each. User instructions can override any of these.
**Simple exits** — just report and exit:
| Status | Default Behavior |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------- |
| `ci_success` | Exit with success |
| `cipe_canceled` | Exit, CI was canceled |
| `cipe_timed_out` | Exit, CI timed out |
| `polling_timeout` | Exit, polling timeout reached |
| `circuit_breaker` | Exit, no progress after 13 consecutive polls |
| `environment_rerun_cap` | Exit, environment reruns exhausted |
| `fix_auto_applying` | Self-healing is handling it — just record `last_cipe_url`, enter wait mode. No MCP call or local git ops needed. |
| `error` | Wait 60s and loop |
**StaRelated in Cloud & DevOps
appbuilder-action-scaffolder
IncludedCreate, implement, deploy, and debug Adobe Runtime actions with consistent layout, validation, and error handling. Use this skill whenever the user needs to add actions to an App Builder project, understand action structure (params, response format, web/raw actions), configure actions in the manifest, use App Builder SDKs (State, Files, Events, database), deploy and invoke actions via CLI, debug action issues, or implement patterns such as webhook receivers, custom event providers, journaling consumers, large payload redirects, action sequence pipelines, and Asset Compute workers. Also trigger when users mention serverless functions in Adobe context, action logging, IMS authentication for actions, or cron-style scheduled actions.
orchestrating-datacloud
IncludedSalesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. Use this skill when the user needs a multi-step Data Cloud pipeline, cross-phase troubleshooting, or data space and data kit management. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase sf data360 workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching phase-specific skill), the task is STDM/session tracing/parquet telemetry (use observing-agentforce), standard CRM SOQL (use querying-soql), or Apex implementation (use generating-apex).
github-project-automation
IncludedAutomate GitHub repository setup with CI/CD workflows, issue templates, Dependabot, and CodeQL security scanning. Includes 12 production-tested workflows and prevents 18 errors: YAML syntax, action pinning, and configuration. Use when: setting up GitHub Actions CI/CD, creating issue/PR templates, enabling Dependabot or CodeQL scanning, deploying to Cloudflare Workers, implementing matrix testing, or troubleshooting YAML indentation, action version pinning, secrets syntax, runner versions, or CodeQL configuration. Keywords: github actions, github workflow, ci/cd, issue templates, pull request templates, dependabot, codeql, security scanning, yaml syntax, github automation, repository setup, workflow templates, github actions matrix, secrets management, branch protection, codeowners, github projects, continuous integration, continuous deployment, workflow syntax error, action version pinning, runner version, github context, yaml indentation error
sf-datacloud
IncludedSalesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase `sf data360` workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching sf-datacloud-* skill), the task is STDM/session tracing/parquet telemetry (use sf-ai-agentforce-observability), standard CRM SOQL (use sf-soql), or Apex implementation (use sf-apex).
fabric-cli
IncludedUse this skill for Fabric.so CLI workflows with the `fabric` terminal command: diagnose/install/login, search or browse a Fabric library, save notes/links/files, create folders, ask the Fabric AI assistant, manage tasks/workspaces, generate shell completion, check subscription usage, produce JSON output, and use Fabric as persistent agent memory. Do not use for Microsoft Fabric/Azure/Power BI `fab`, Daniel Miessler's Fabric framework, Python Fabric SSH, Fabric.js, or textile/fashion fabric.
lark
IncludedLark/Feishu CLI skills: lark-cli operations for docs, markdown, sheets, base, calendar, im, mail, task, okr, drive, wiki, slides, whiteboard, apps, approval, attendance, contact, vc, minutes, event. Use when the user needs to operate Lark/Feishu resources via lark-cli, send messages, manage documents, spreadsheets, calendars, tasks, OKRs, deploy web pages, or any Feishu/Lark workspace operations.