alibabacloud-data-agent-skill

Invoke Alibaba Cloud Apsara Data Agent for Analytics via CLI to perform natural-language-driven data analysis on enterprise databases. Data Agent for Analytics is an intelligent data analysis agent developed by the Alibaba Cloud Database team for enterprise users. Given a natural language description, it automatically completes requirement analysis, data understanding, insight generation, and report generation. This tool supports discovering data resources (instances/databases/tables) managed in DMS, initiating query or deep analysis sessions, tracking progress in real time, and retrieving analysis conclusions and generated reports. Use this Skill when users need to query databases, analyze data trends, generate data reports, or ask questions in natural language, or when they mention "Data Agent", "data analysis", "database query", "SQL analysis", or "data insights".

What this skill does

metadata:
  author: DataAgent Team
  version: "1.8.5"
---

# Changelog
- **v1.8.5**: Database listing migrated to `ListTagMetaAsset` (dms-enterprise 2018-11-01); workspace auto-resolution (CLI `--workspace-id` > env `DATA_AGENT_WORKSPACE_ID` > `InitDataAgentPersonalWorkspace`); `db` subcommand relaxed `--dms-instance-id` / `--instance-name` to optional
- **v1.8.4**: Document project Python virtualenv (`venv/`) setup and activation; add end-to-end regression notes for ASK_DATA / ANALYSIS (async + attach)
- **v1.8.3**: `db` and `file` subcommands now accept `--session-mode CLAW`
- **v1.8.2**: `SendChatMessage` now supports per-message `Mode=CLAW` (injected via `SessionConfig.Mode`); dynamic DMSUnit resolution via `GetActiveRouteUnit`
- **v1.8.1**: Emphasize `attach`-based session reuse as the core interaction mechanism; add golden workflow, capability matrix, and usage rules
- **v1.8.0**: Add workspace (collaborative space) support, add custom agent support
- **v1.7.2**: Use Alibaba Cloud default credential chain instead of explicit AK/SK, add User-Agent header, fix RAM policy wildcard issues
- **v1.7.1**: Fix CLI `ls` command API response parsing (support case-insensitive field names), optimize SKILL documentation structure, separate ANALYSIS mode specification document
- **v1.7.0**: API_KEY authentication support, native async execution mode, session isolation, enhanced attach mode, optimized log output

---

# Installation

## Python Environment (venv) — MUST READ

> **🚨 Hard Requirement: Python ≥ 3.10**
>
> The macOS system `/usr/bin/python3` is typically 3.8 or 3.9 and **cannot run this project** (it relies on `match/case`, `TypeAlias`, `|` union syntax, and other 3.10+ features).
>
> Verify your version first: `python3 --version`. If below 3.10, install via Homebrew or pyenv:
> ```bash
> # Homebrew
> brew install python@3.12
> # Or pyenv
> pyenv install 3.12.4 && pyenv local 3.12.4
> ```

> **⚠️ You MUST use a venv virtual environment. Never install dependencies globally.** Running `pip install` against the system Python pollutes the environment and may fail due to permission issues.

### Use Existing venv (Recommended)

The project ships a pre-built `venv/` directory (all dependencies pre-installed). Use it whenever possible:

```bash
cd alibabacloud-data-agent-skill

# Option A (recommended): activate the venv
source venv/bin/activate
python3 scripts/data_agent_cli.py ls

# Option B: invoke the venv interpreter directly (no activation needed)
venv/bin/python3 scripts/data_agent_cli.py ls
```

### Rebuild venv

If `venv/` is missing or dependencies are corrupted, recreate with a **3.10+** Python:

```bash
python3.12 -m venv venv          # explicitly use a 3.10+ interpreter
source venv/bin/activate
pip install -r scripts/requirements.txt
```
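
After activating the environment, a quick preflight check can confirm both requirements stated above: the interpreter is 3.10 or newer, and it is running inside a virtual environment. The following is a minimal sketch, not part of the project; the function names are hypothetical, and it only uses the standard `sys` module.

```python
import sys

MIN_VERSION = (3, 10)  # minimum Python version stated in this document


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True inside a venv, where sys.prefix differs from sys.base_prefix."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python >= 3.10:", meets_minimum())
    print("inside venv:", in_virtualenv())
```

If either line prints `False`, activate `venv/` again or rebuild it with a 3.10+ interpreter before running the CLI.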

> **Tip**: All examples in this document write `python3 scripts/data_agent_cli.py ...`. When venv is activated, `python3` resolves to the venv interpreter automatically; otherwise prefix with `venv/bin/python3`.

## Configure Credentials

This Skill authenticates with either the Alibaba Cloud default credential chain (recommended) or an API_KEY.

### Option 1: Default Credential Chain (Recommended)

The Skill relies on the Alibaba Cloud SDK's default credential chain to obtain credentials automatically from environment variables, configuration files, instance roles, and similar sources.

See [Alibaba Cloud Credential Chain Documentation](https://help.aliyun.com/document_detail/378659.html)

### Option 2: API_KEY Authentication (File Analysis Only)

```bash
export DATA_AGENT_API_KEY=your-api-key
export DATA_AGENT_REGION=cn-hangzhou
```
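
To see which of the two options a given shell is set up for, the environment can be inspected before running the CLI. The sketch below is illustrative only: `describe_auth` is a hypothetical helper, and the rule "API_KEY is used when `DATA_AGENT_API_KEY` is set" is an assumption, since the CLI's actual selection logic is not described here.

```python
import os


def describe_auth(env=None):
    """Report which documented auth option the environment suggests.

    Assumption: a non-empty DATA_AGENT_API_KEY means API_KEY auth is intended;
    otherwise the default credential chain applies.
    """
    env = os.environ if env is None else env
    region = env.get("DATA_AGENT_REGION") or None
    api_key = (env.get("DATA_AGENT_API_KEY") or "").strip()
    if api_key:
        return {"method": "api_key", "region": region}
    return {"method": "default_credential_chain", "region": region}


if __name__ == "__main__":
    print(describe_auth())
```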

Get API_KEY: [Data Agent Console](https://agent.dms.aliyun.com/cn-hangzhou/api-key)

### Permission Requirements

RAM users need `AliyunDMSFullAccess` or `AliyunDMSDataAgentFullAccess` permissions.
See [RAM-POLICIES.md](references/RAM-POLICIES.md) for detailed permission information.

## Debug Mode

```bash
DATA_AGENT_DEBUG_API=1 python3 scripts/data_agent_cli.py file example.csv -q "analyze"
```
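
When the CLI is driven from another Python script, the same debug flag can be passed through the child process environment. This is a minimal sketch using only the standard library; `debug_invocation` and `run_debug` are hypothetical helpers, and the script path assumes the current directory is the skill root.

```python
import os
import subprocess
import sys


def debug_invocation(cli_args, base_env=None):
    """Build (argv, env) for running the CLI with DATA_AGENT_DEBUG_API=1."""
    env = dict(os.environ if base_env is None else base_env)
    env["DATA_AGENT_DEBUG_API"] = "1"
    # sys.executable is the venv interpreter when the venv is active.
    argv = [sys.executable, "scripts/data_agent_cli.py", *cli_args]
    return argv, env


def run_debug(cli_args):
    """Run the CLI with debug output enabled and return its exit code."""
    argv, env = debug_invocation(cli_args)
    return subprocess.run(argv, env=env, check=False).returncode
```

For example, `run_debug(["file", "example.csv", "-q", "analyze"])` mirrors the shell command above.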

## 💡 Getting Started Tips

- For a first run, use the built-in demo database `internal_data_employees` (DataAgent's built-in test database containing employee, department, and salary data)
- Or try file analysis with the local file `assets/example_game_data.csv`


# Data Agent CLI — Unified Command-Line Data Analysis Tool

## Overview

`scripts/data_agent_cli.py` helps users complete the full workflow from **discover data → initiate analysis → track progress → get results**.

### Core Concepts

> **⚠️ Key Prerequisite**: Data Agent can only analyze databases that have been **imported into Data Agent Data Center**.
>
> - **Data Center**: Data Agent's own data center; only databases imported here can be analyzed
> - **DMS**: Alibaba Cloud Data Management Service, which stores metadata for all databases
> - **Relationship**: a database registered in DMS is not automatically in Data Center (DMS ≠ Data Center)
>
> **Usage Flow**:
> 1. First use `ls` to check if the target database exists in Data Center
> 2. If **not found**, use `dms` subcommand to search for database info, then use `import` subcommand to import it
> 3. After successful import, you can use `db` subcommand for analysis
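
The usage flow above reduces to one decision: whether the target database already appears in Data Center. A minimal sketch of that decision follows. `remaining_subcommands` is a hypothetical helper, and it assumes the caller has already collected the database names reported by `ls`; the output format of `ls` and the flags of each subcommand are not assumed here.

```python
def remaining_subcommands(target_db, data_center_dbs):
    """Return the subcommands still to run, in order, after `ls`.

    `data_center_dbs` is the collection of database names found in Data Center.
    """
    steps = []
    if target_db not in data_center_dbs:
        # Not imported yet: look it up in DMS, then import it.
        steps += ["dms", "import"]
    steps.append("db")
    return steps


if __name__ == "__main__":
    print(remaining_subcommands("chinook", {"internal_data_employees"}))
```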

---

## Analysis Modes

- **ASK_DATA** (default): Synchronous execution, sub-second response, suitable for quick Q&A
- **ANALYSIS**: Deep analysis, takes 5-40 minutes, requires spawning a sub-agent for async execution or using the `--async-run` parameter
- **INSIGHT**: Insight-oriented exploration, follows the same plan-confirmation flow as ANALYSIS
- **CLAW**: Agentic CLAW mode. Two entry points:
  - CLI: `db --session-mode CLAW ...` / `file --session-mode CLAW ...` (session-level)
  - SDK: pass `mode="CLAW"` to `client.send_message(...)` / `AsyncDataAgentClient.send_message(...)` to override mode for a single message via `SessionConfig.Mode`

### End-to-End Regression Reference (v1.8.4 verified)

Both ASK_DATA and ANALYSIS modes are regression-tested against the `chinook` database with the async + attach flow:

| Mode | Kickoff | Observed Chain | Typical Duration |
|------|---------|----------------|------------------|
| ASK_DATA | `db --session-mode ASK_DATA -q "..."` | async worker → live SSE → `result.json={"status":"completed"}` | ~15s |
| ANALYSIS | `db --session-mode ANALYSIS -q "..."` | async worker → **Plan** → `WAIT_INPUT` → `attach -q "confirm"` → step-by-step execution → Excel/Chart artifacts → text report → **2nd WAIT_INPUT** (webpage render) | 2-10 min (text); +10 min if rendering webpage |

Key checkpoints to look for in `sessions/<SESSION_ID>/progress.log`:

- `> User Query: ...` — request received
- `### Execution Plan (ID: ...)` — ANALYSIS plan generated, use `attach -q "confirm"` to proceed
- `> ⚠️  Plan confirmed, continuing analysis...` — plan approved, execution starts
- `## Step N/M: ...` — per-step progress with artifact links
- `### Report Render` + `⚠️  Please review the report rendering request.` — optional HTML report render confirmation
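
The checkpoints above can be detected mechanically when monitoring a session. The sketch below matches log lines against the marker strings listed here; the helper names are hypothetical, and the exact layout of `progress.log` (for example leading whitespace) is an assumption that may need adjusting.

```python
import re

# Marker patterns taken from the checkpoint list in this document.
CHECKPOINTS = [
    ("query_received", re.compile(r"^> User Query:")),
    ("plan_generated", re.compile(r"^### Execution Plan \(ID:")),
    ("plan_confirmed", re.compile(r"Plan confirmed, continuing analysis")),
    ("step", re.compile(r"^## Step \d+/\d+:")),
    ("report_render", re.compile(r"^### Report Render")),
]


def classify(line):
    """Return the checkpoint name for one log line, or None if it matches nothing."""
    for name, pattern in CHECKPOINTS:
        if pattern.search(line):
            return name
    return None


def checkpoints_seen(lines):
    """Return checkpoint names in the order they first appear."""
    seen = []
    for line in lines:
        name = classify(line.rstrip("\n"))
        if name and name not in seen:
            seen.append(name)
    return seen


def awaiting_plan_confirmation(lines):
    """True when a plan was generated but not yet confirmed."""
    seen = checkpoints_seen(lines)
    return "plan_generated" in seen and "plan_confirmed" not in seen
```

When `awaiting_plan_confirmation` returns `True` for the lines of `sessions/<SESSION_ID>/progress.log`, the session is at the point where `attach -q "confirm"` is expected.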

> See [ANALYSIS_MODE.md](references/ANALYSIS_MODE.md) for details

---

## Workspace (Collaborative Space)

Workspaces are collaborative spaces that enable team-based data analysis with shared sessions, data sources, and access control.

- **List workspaces**: Use `workspace` subcommand to discover available workspaces (personal or shared)
- **Bind session to workspace**: Pass `--workspace-id <ID>` when using `db` or `file` to create a session within a specific workspace context
- **Workspace types**: `MY` (default, personal spaces), `ALL` (all accessible spaces including shared ones)

> **Note**: When a session is created within a workspace, all subsequent API calls (describe, send message, etc.) automatically carry the workspace context.

### Workspace Resolution

The workspace ID is resolved automatically in this order:
1. CLI flag `--workspace-id <id>`
2. Environment variable `DATA_AGENT_WORKSPACE_ID`
3. Auto-create personal workspace via `InitDataAgentPersonalWorkspace`
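
The resolution order above can be sketched as a small function. This is an illustration under stated assumptions, not the CLI's code: `resolve_workspace_id` is a hypothetical name, and `init_personal_workspace` is a caller-supplied callable standing in for the `InitDataAgentPersonalWorkspace` API call.

```python
import os


def resolve_workspace_id(cli_workspace_id=None, env=None, init_personal_workspace=None):
    """Resolve the workspace ID: CLI flag, then env var, then auto-create."""
    env = os.environ if env is None else env
    # 1. An explicit --workspace-id value wins.
    if cli_workspace_id:
        return cli_workspace_id
    # 2. Fall back to the DATA_AGENT_WORKSPACE_ID environment variable.
    env_value = (env.get("DATA_AGENT_WORKSPACE_ID") or "").strip()
    if env_value:
        return env_value
    # 3. Otherwise create (or fetch) the personal workspace via the callable.
    if init_personal_workspace is None:
        raise ValueError("no workspace ID given and no fallback provided")
    return init_personal_workspace()
```

Passing the fallback as a callable keeps the API call lazy, so it only runs when neither the flag nor the environment variable supplies an ID.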

Both AK/SK and API_KEY authent

Source: https://github.com/aliyun/alibabacloud-aiops-skills/tree/main/skills/database/dms/alibabacloud-data-agent-skill
