video-insight

Extract transcripts, generate summaries, create Q&A highlights, and perform deep research from YouTube videos or local media files. Use when the user provides a YouTube URL or local video/audio file path and asks to summarize, digest, analyze, or transcribe media content. Triggers: "video insight", "summarize video", "transcribe audio" + URL or file path.

# Video Insight

Analyzes YouTube videos or local media files to generate summaries,
insights, and optionally Q&A highlights to reinforce key learning points.

## Architecture

```mermaid
flowchart TB
    subgraph Main["Main Session"]
        SKILL[SKILL.md<br/>Orchestrator]
    end

    subgraph Agents["Subagents"]
        subgraph Haiku["Haiku Models"]
            QM[qa-generator<br/>Q&A Generation]
        end
        subgraph Sonnet["Sonnet Models"]
            TA[transcript-analyzer<br/>Transcript Analysis]
            DW[digest-writer<br/>Digest Writing]
            DR[deep-researcher<br/>Deep Research]
        end
    end

    SKILL --> TA
    SKILL --> DW
    SKILL --> QM
    SKILL --> DR

    TA -.->|Return Summary| SKILL
    DW -.->|Save Document| SKILL
    QM -.->|Q&A Section| SKILL
    DR -.->|Research Results| SKILL

    style Main fill:#f5f5f5,stroke:#333
    style Haiku fill:#e1f5fe,stroke:#0288d1
    style Sonnet fill:#fff3e0,stroke:#f57c00
```

**Context Management**: The main session handles orchestration only.
Subagents process long transcripts so that transcript text never fills the main session's context.

## Prerequisites

**YouTube URL Processing:**

- Requires `yt-dlp` (`brew install yt-dlp`)

**Local File Processing:**

- Requires `whisper-cpp` (`brew install whisper-cpp`)
- Requires `ffmpeg` (`brew install ffmpeg`)
- Whisper model download (automatic on first run)

Check dependencies: `./scripts/check_dependencies.sh`

## Supported Input Types

| Type          | Pattern                                         | Processing Method            |
|---------------|-------------------------------------------------|------------------------------|
| YouTube URL   | `https://youtu.be/`, `https://www.youtube.com/` | Subtitle extraction (yt-dlp) |
| Video File    | `*.mp4`, `*.mov`                                | whisper.cpp STT              |
| Audio File    | `*.mp3`, `*.m4a`                                | whisper.cpp STT              |
| Subtitle File | `*.srt`, `*.vtt`                                | Use directly                 |

## Workflow

### Dependency Check (Before Starting)

**CRITICAL**: Check required dependencies before processing.
If missing, show installation guide and **stop immediately** (do not retry).

**For YouTube URL:**

```bash
./scripts/check_dependencies.sh --youtube
```

**For Local Media File:**

```bash
./scripts/check_dependencies.sh --local
```

**If exit code is 1 (missing dependencies):**

1. Display the script output (shows missing tools and install commands)
2. Inform user: "Please install the required dependencies and try again."
3. Reference: `references/prerequisites.md` for detailed installation guide
4. **Stop processing** - do not attempt to continue or retry

**Important**: Do not repeatedly check or retry installation.
The user must manually install dependencies and re-run the command.
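
The exit-code contract above can be illustrated with a minimal sketch. It is not the contents of `check_dependencies.sh`, which are not shown here; the function names `missing_tools` and `check_tools` are hypothetical, and the sketch only assumes that each dependency can be detected as a command on `PATH`.

```bash
# Hypothetical sketch; the real check_dependencies.sh may work differently.

# Prints the name of every requested tool that is not on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Returns 1 and lists the missing tools on stderr if any are absent,
# otherwise returns 0. The caller stops on a non-zero status and does not retry.
check_tools() {
    missing=$(missing_tools "$@")
    if [ -n "$missing" ]; then
        printf 'Missing dependencies:\n%s\n' "$missing" >&2
        return 1
    fi
    return 0
}
```

For example, `check_tools yt-dlp || exit 1` would stop a YouTube run before any processing starts.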

### Step 0: Detect Input Type

Determine whether the input is a YouTube URL or a local file:

**YouTube URL Pattern:**

```regex
^https?://(www\.)?(youtube\.com|youtu\.be)/
```

**Local File:**

- Check file existence (`[ -f "$INPUT" ]`)
- Determine type by extension

**Branching:**

- YouTube URL → Step 1A (YouTube metadata)
- Local media file → Step 1B (Local metadata)
- Subtitle file (srt/vtt) → Go directly to Step 3
- Invalid input → Error message
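
The branching above could be sketched as a single shell function. This is an illustration, not the skill's actual implementation: `classify_input` is a hypothetical name, and the extension lists are taken from the Supported Input Types table. The URL test uses the pattern above, anchored with a `/` after the host so that look-alike hosts such as `youtube.com.example.org` do not match.

```bash
# Hypothetical helper: prints one of "youtube", "media", "subtitle", "invalid".
classify_input() {
    if printf '%s\n' "$1" | grep -Eq '^https?://(www\.)?(youtube\.com|youtu\.be)/'; then
        echo youtube
    elif [ ! -f "$1" ]; then
        # Neither a YouTube URL nor an existing file.
        echo invalid
    else
        # Compare the extension case-insensitively.
        case $(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]') in
            mp4|mov|mp3|m4a) echo media ;;
            srt|vtt)         echo subtitle ;;
            *)               echo invalid ;;
        esac
    fi
}
```

A caller would then dispatch on the printed value: `youtube` goes to Step 1A, `media` to Step 1B, `subtitle` to Step 3, and `invalid` produces an error message.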

### Step 1A: Extract YouTube Metadata

```bash
./scripts/extract_metadata.sh "{youtube_url}"
```

Extract from JSON result:

- `title`, `channel`, `upload_date`, `duration`, `description`
- `chapters` (if available)
- `subtitles`, `automatic_captions` (subtitle availability)

### Step 1B: Extract Local File Metadata

```bash
./scripts/extract_local_metadata.sh "{file_path}"
```

Extract from JSON result:

- `title` (extracted from filename)
- `duration` (extracted with ffprobe)
- `format` (file format)
- `source: "local"` (local file indicator)

### Step 2: Check Video Duration

**If over 60 minutes**, present options with AskUserQuestion:

```yaml
question: "Video duration is {duration}. How would you like to proceed?"
options:
  - label: "Process entire video"
    description: "Process the full video (may take longer)"
  - label: "First 30 minutes only"
    description: "Process only the first 30 minutes"
  - label: "Cancel"
    description: "Cancel video processing"
```
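
The threshold check itself is simple arithmetic. The sketch below assumes the metadata step reports `duration` in seconds (possibly with a fractional part, as `ffprobe` does); `is_long_video` and `format_duration` are hypothetical names used only for illustration.

```bash
# Threshold from this step: 60 minutes, expressed in seconds.
LONG_VIDEO_SECONDS=3600

# Succeeds (status 0) when the duration is over the threshold.
# A fractional part such as "3725.48" is dropped before comparing.
is_long_video() {
    [ "${1%.*}" -gt "$LONG_VIDEO_SECONDS" ]
}

# Formats seconds as H:MM:SS, e.g. for the {duration} placeholder.
format_duration() {
    secs=${1%.*}
    printf '%d:%02d:%02d\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}
```

For a 3725.48-second file, `is_long_video` succeeds and `format_duration` prints `1:02:05`, so the question would be shown.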

### Step 3: Extract Transcript

**For YouTube URL:**

```bash
./scripts/extract_transcript.sh "{youtube_url}" "/tmp/video-insight"
```

Subtitle priority:
Korean manual > English manual > Korean auto > English auto
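
The priority order can be expressed as a first-match search. The sketch below is illustrative only: `pick_subtitle` and the `lang:kind` labels (for example `ko:manual`) are hypothetical, and `extract_transcript.sh` may represent tracks differently.

```bash
# Hypothetical helper: prints the highest-priority track among the arguments.
# Returns 1 and prints nothing when no Korean or English track is available.
pick_subtitle() {
    for wanted in ko:manual en:manual ko:auto en:auto; do
        for have in "$@"; do
            if [ "$have" = "$wanted" ]; then
                printf '%s\n' "$wanted"
                return 0
            fi
        done
    done
    return 1
}
```

For example, `pick_subtitle en:auto ko:auto en:manual` prints `en:manual`, because a manual English track outranks both auto-generated tracks.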

**If no subtitles available**, present options with AskUserQuestion:

```yaml
question: "No subtitles found. How would you like to proceed?"
options:
  - label: "Summarize description only"
    description: "Create a brief summary from the video description"
  - label: "Cancel"
    description: "Cancel video processing"
```

**For local media file:**

```bash
./scripts/extract_local_transcript.sh "{file_path}" "/tmp/video-insight"
```

Transcribe speech to text with whisper.cpp (default language: Korean).

**For existing subtitle file:**

Copy the srt/vtt file to `/tmp/video-insight/` and use it as the transcript.

### Step 4: Analyze Transcript (Subagent)

Call **transcript-analyzer** (Sonnet):

```markdown
Using Task tool:
- subagent_type: "transcript-analyzer"
- model: sonnet
- prompt: |
    Analyze the transcript file.

    - transcript_path: /tmp/video-insight/{title}.ko.srt
    - metadata: {metadata JSON}
    - language: ko

    Extract key content, timeline, and important quotes.
```

**Result**: Only the analysis results are returned to the main session, not the entire transcript.

### Step 5: Confirm Save Path

Confirm save path with AskUserQuestion:

```yaml
question: "Where would you like to save the digest file?"
header: "Save path"
options:
  - label: "Default path"
    description: "outputs/video/{YYYY-MM-DD}__{title}.md"
  - label: "Current folder"
    description: "./{YYYY-MM-DD}__{title}.md"
  - label: "Custom path"
    description: "Specify a custom path"
```

**If custom path selected**: Ask the user to enter the path.
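
The default path pattern can be filled in as follows. This is a sketch under assumptions: `default_digest_path` is a hypothetical name, and replacing `/` in titles with `-` is a choice made here (the source does not say how titles are sanitized).

```bash
# Hypothetical helper: builds "outputs/video/{YYYY-MM-DD}__{title}.md".
# The date is an optional second argument and defaults to today.
default_digest_path() {
    title=$1
    day=${2:-$(date +%Y-%m-%d)}
    # Assumption: "/" cannot appear in a file name, so replace it with "-".
    safe_title=$(printf '%s' "$title" | tr '/' '-')
    printf 'outputs/video/%s__%s.md\n' "$day" "$safe_title"
}
```

Calling `default_digest_path "AI/ML Talk" 2025-01-31` prints `outputs/video/2025-01-31__AI-ML Talk.md`.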

### Step 6: Write Digest (Subagent)

Call **digest-writer** (Sonnet):

```markdown
Using Task tool:
- subagent_type: "digest-writer"
- model: sonnet
- prompt: |
    Write a digest document.

    - analysis_result: {Step 4 result}
    - metadata: {metadata}
    - output_path: {path confirmed in Step 5}
    - template_path: templates/video-insight.md

    Also perform proper noun correction and add background information.
```

**Result**: A confirmation message that the Markdown file was saved.

### Step 7: Additional Content Options

Present options with AskUserQuestion (multiSelect enabled):

```yaml
question: "Would you like to add additional sections?"
header: "Options"
multiSelect: true
options:
  - label: "Q&A Section"
    description: "Add Q&A highlights (1-5 pairs based on content length)"
  - label: "Deep Research"
    description: "Conduct in-depth research with web search"
  - label: "Skip all"
    description: "Generate digest only without additional sections"
```

### Step 8: Generate Additional Content (Parallel Execution)

Based on user selection, execute agents in parallel.
Each agent returns content only (does not write to file).

**If Q&A selected**, call **qa-generator** (Haiku):

```markdown
Using Task tool:
- subagent_type: "qa-generator"
- model: haiku
- prompt: |
    Generate Q&A section content.

    - digest_path: {file path from Step 6}
    - qa_patterns_path: references/qa-patterns.md

    Create 1-5 Q&A pairs (based on content length)
    highlighting key information from the video.
    Return the Q&A section content in markdown format
    (do not write to file).
```

**If Deep Research selected**, call **deep-researcher** (Sonnet):

```markdown
Using Task tool:
- subagent_type: "deep-researcher"
- model: sonnet
- prompt: |
    Perform deep research.

    - digest_path: {file path from Step 6}
    - deep_research_reference: references/deep-research.md

    Collect related materials via web search.
    Return the Deep Research section content in markdown format
    (do not write to file).
```

**Parallel Execution**: If both options are selected,
launch both Task tools in a single message for parallel execution.

### Step 9: Append R
