Claude
Skills
Sign in
Back

resemble-detect

Included with Lifetime
$97 forever

Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI

Image & Video

What this skill does


# Resemble Detect — Deepfake Detection & Media Safety

Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.

## Core Principle — THE IRON LAW

**"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."**

Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned `label`, `score`, and `status: "completed"`. If the detection is still `processing`, wait. If it `failed`, say so — do not substitute your own judgment.

## When to Use

Use this skill whenever the user's request involves any of these:

- Checking if audio, video, image, or text is AI-generated or manipulated
- Detecting deepfakes in any media format
- Verifying media authenticity or provenance
- Identifying which AI platform synthesized audio (source tracing)
- Applying or detecting watermarks on media
- Analyzing media for speaker info, emotion, transcription, or misinformation
- Asking natural-language questions about detection results
- Matching or verifying speaker identity against known voice profiles
- Detecting AI-generated or machine-written text
- Any mention of: "deepfake", "fake detection", "synthetic media", "voice verification", "watermark", "media forensics", "authenticity check", "source tracing", "is this real", "AI-written text", "text detection"

**Do NOT use** for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.

## Capability Decision Tree

| User wants to...                                      | Use this                  | API endpoint                          |
|-------------------------------------------------------|---------------------------|---------------------------------------|
| Check if media is AI-generated / deepfake             | **Deepfake Detection**    | `POST /detect`                        |
| Know *which AI platform* made fake audio              | **Audio Source Tracing**  | `POST /detect` with flag              |
| Get speaker info, emotion, transcription from media   | **Intelligence**          | `POST /intelligence`                  |
| Ask questions about a completed detection             | **Detect Intelligence**   | `POST /detects/{uuid}/intelligence`   |
| Apply an invisible watermark to media                 | **Watermark Apply**       | `POST /watermark/apply`               |
| Check if media contains a watermark                   | **Watermark Detect**      | `POST /watermark/detect`              |
| Verify a speaker's identity against known profiles    | **Identity Search**       | `POST /identity/search`               |
| Check if text is AI-generated                         | **Text Detection**        | `POST /text_detect`                   |
| Create a voice identity profile for future matching   | **Identity Create**       | `POST /identity`                      |

When multiple capabilities apply (e.g., user wants deepfake detection AND intelligence), combine them in a single `POST /detect` call using the `intelligence: true` flag rather than making separate requests.

## Required Setup

- **API Key**: Bearer token from the Resemble AI dashboard (set as `RESEMBLE_API_KEY`)
- **Base URL**: `https://app.resemble.ai/api/v2`
- **Auth Header**: `Authorization: Bearer <RESEMBLE_API_KEY>`
- **Media Requirement**: All media must be at a publicly accessible HTTPS URL

If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API. (Exception: `POST /text_detect` accepts text content inline.)

## MCP Tools Available

When the Resemble MCP server is connected, use these tools instead of raw API calls:

| Tool                      | Purpose                                           |
|---------------------------|---------------------------------------------------|
| `resemble_docs_lookup`    | Get comprehensive docs for any detect sub-topic   |
| `resemble_search`         | Search across all documentation                   |
| `resemble_api_endpoint`   | Get exact OpenAPI spec for any endpoint           |
| `resemble_api_search`     | Find endpoints by keyword                         |
| `resemble_get_page`       | Read specific documentation pages                 |
| `resemble_list_topics`    | List all available topics                         |

**Tool usage pattern**: Use `resemble_docs_lookup` with topic `"detect"` to get the full picture, then `resemble_api_endpoint` for exact request/response schemas before making API calls.

## Full API Reference

Detailed request/response schemas for every endpoint are in **[references/api-reference.md](references/api-reference.md)**. Consult it before making any API call to verify exact parameter names and response shapes. The sections below cover decision-making; the reference covers exact field formats.

---

## Phase 1: Deepfake Detection

The core capability. Submit audio, image, or video for AI-generated content analysis via `POST /detect`.

**Key flags to consider:**
- `visualize: true` — generate heatmap/visualization artifacts
- `intelligence: true` — run multimodal intelligence alongside detection (saves a round-trip)
- `audio_source_tracing: true` — identify which AI platform synthesized fake audio (only fires on `"fake"` audio)
- `use_reverse_search: true` — enable reverse image search (image only)
- `zero_retention_mode: true` — auto-delete media after analysis (for sensitive content)

Detection is asynchronous. Poll `GET /detect/{uuid}` at 2s → 5s → 10s intervals until `status` is `"completed"` or `"failed"`. Most complete in 10–60 seconds.

**Supported formats:** Audio (WAV, MP3, OGG, M4A, FLAC) · Video (MP4, MOV, AVI, WMV) · Image (JPG, PNG, GIF, WEBP)

### Reading Results

- **Audio** — verdict in `metrics` — use `label` and `aggregated_score`
- **Image** — verdict in `image_metrics` — use `label` and `score`; `ifl` has an Invisible Frequency Layer heatmap
- **Video** — verdict in `video_metrics` — hierarchical tree of frame/segment results; video-with-audio returns both `metrics` and `video_metrics`

See [references/api-reference.md](references/api-reference.md#reading-results-by-media-type) for full response schemas.

### Interpreting Scores

| Score Range | Interpretation                                      |
|-------------|-----------------------------------------------------|
| 0.0 – 0.3  | Strong indication of authentic/real media            |
| 0.3 – 0.5  | Inconclusive — recommend additional analysis         |
| 0.5 – 0.7  | Likely synthetic — flag for review                   |
| 0.7 – 1.0  | High confidence synthetic/AI-generated               |

**Always present scores with context.** Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."

---

## Phase 2: Intelligence — Media Analysis

Rich structured insights about media: speaker info, emotion, transcription, translation, misinformation, abnormalities.

Two ways to run Intelligence:
1. **Combined with detection** — add `intelligence: true` to `POST /detect` (preferred; one call)
2. **Standalone** — `POST /intelligence` with a URL (when you only need analysis, not a deepfake verdict)

**Audio/video structured fields include:** `speaker_info`, `language`, `dialect`, `emotion`, `speaking_style`, `context`, `message`, `abnormalities`, `transcription`, `translation`, `misinformation`.

**Image structured fields include:** `scene_description`, `subjects`, `authenticity_analysis`, `context_and_setting`, `abnormalities`, `misinformation`.

### Detect Intelligence — Ask Questions About Results

After a detection completes, ask natural-language questions via `POST /detects/{detect_uuid}/intelligence` with `{ "query": "..." }`. Returns a question UUID — poll 

Related in Image & Video