pixverse-ai-image-and-video-generator

PixVerse CLI — generate AI videos and images from the command line. Supports PixVerse V6, Veo, Sora, Grok, Seedance, Kling, Happy Horse video models; Nano Banana (Gemini), Seedream, Qwen, Kling, GPT Image image models; and PixVerse's rich effect template library. Start here.

What this skill does


# PixVerse CLI — Master Skill

## What is PixVerse CLI

PixVerse CLI is the official command-line interface for [PixVerse](https://pixverse.ai) — an AI-powered creative platform for generating videos and images. It is essentially **a UI-free version of the PixVerse website**: all features, models, and parameters are aligned with the web experience at [app.pixverse.ai](https://app.pixverse.ai).

It is designed for:
- **AI agents** (primary) — structured JSON output, deterministic exit codes, and pipeable commands for autonomous workflows (Claude Code, Cursor, Codex, custom agents)
- **Developers & power users** — scriptable video/image generation without leaving the terminal
- **Automation** — batch processing, CI/CD pipelines, content production workflows

Key facts:
- Generating content **consumes credits** from the user's PixVerse account (same pricing as the website)
- **Only subscribed users** can use the CLI — see [subscription plans](https://app.pixverse.ai/subscribe)
- All output can be returned as structured JSON via the `--json` flag
- English only

---

## Installation

```bash
npm install -g pixverse
```

Or run without installing:
```bash
npx pixverse
```

Verify:
```bash
pixverse --version
```

**Requires Node.js >= 20.**

---

## Quick Start

```bash
# 1. Install
npm install -g pixverse

# 2. Authenticate (OAuth device flow — opens browser)
pixverse auth login --json

# 3. Create a video (waits for completion by default)
RESULT=$(pixverse create video --prompt "A cat astronaut floating in space" --json)
VIDEO_ID=$(echo "$RESULT" | jq -r '.video_id')

# 4. Download the result
pixverse asset download "$VIDEO_ID" --json
```

To skip waiting and poll later:
```bash
RESULT=$(pixverse create video --prompt "A cat astronaut floating in space" --no-wait --json)
VIDEO_ID=$(echo "$RESULT" | jq -r '.video_id')
pixverse task wait "$VIDEO_ID" --json
pixverse asset download "$VIDEO_ID" --json
```
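
The submit, wait, and download steps above can be wrapped in a small helper that stops as soon as one step fails. This is a minimal sketch, not part of the CLI: the function name `pv_generate` is hypothetical, it assumes the `--json` response carries a `video_id` field as in the snippets above, and it assumes `jq` is installed.

```bash
# Hypothetical helper: submit a prompt without waiting, then wait and download.
# Assumes the JSON response has a `video_id` field (as in the Quick Start
# snippets) and that `jq` is on PATH.
pv_generate() {
  local prompt="$1" result video_id

  # Submit the job; propagate the CLI's exit code on failure.
  result=$(pixverse create video --prompt "$prompt" --no-wait --json) || return $?

  # `jq -e` exits non-zero when the field is missing or null.
  video_id=$(printf '%s' "$result" | jq -er '.video_id') || {
    echo "no video_id in response" >&2
    return 1
  }

  # Block until the task finishes, then download the asset.
  pixverse task wait "$video_id" --json >/dev/null || return $?
  pixverse asset download "$video_id" --json
}
```

Usage: `pv_generate "A cat astronaut floating in space"`. Checking each step separately means an empty or `null` ID is never passed on to `task wait` or `asset download`.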

> **Windows users**: For a full PowerShell pipeline example (T2I → I2V → upscale → download), see `skills/examples/windows/powershell-text-to-video.ps1`.

---

## Authentication

PixVerse CLI uses **OAuth device flow** — no need to manually copy tokens:

1. Run `pixverse auth login --json`
2. The CLI prints an authorization URL
3. Open the URL in your browser and authorize
4. The token is stored automatically in `~/.pixverse/`

Details:
- The token is valid for 30 days
- CLI sessions are independent of your web/app sessions
- If the token expires (exit code 3), re-run `pixverse auth login --json`
- Run `pixverse auth status --json` to check login state and credits
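
The expired-token rule above (exit code 3, then log in again) can be sketched as a retry wrapper. This is an illustrative sketch only: `pv_with_reauth` is a hypothetical name, and the only exit code it interprets is 3, the one documented here.

```bash
# Hypothetical wrapper: run a command; if it exits with code 3 (expired token,
# per the note above), log in again and retry exactly once.
pv_with_reauth() {
  local status=0
  "$@" || status=$?
  if [ "$status" -eq 3 ]; then
    # Re-authenticate; give up if the login itself fails.
    pixverse auth login --json || return $?
    status=0
    "$@" || status=$?
  fi
  # Any other exit code is passed through unchanged.
  return "$status"
}
```

Usage: `pv_with_reauth pixverse auth status --json`. Because the login step needs a browser authorization, an unattended pipeline may prefer to fail on exit code 3 instead of retrying.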

---

## Capabilities Overview

| I want to... | Use skill |
|:---|:---|
| Create a video from text or image | `pixverse:create-video` |
| Enhance a video prompt for better results (V6 / generic) | `pixverse:prompt-enhance` |
| Optimize a prompt for Seedance 2.0 (auto-triggers when prompt has clear optimization headroom; skipped when prompt is already clean) | `pixverse:seedance-prompt-optimize` |
| Edit video content with AI (replace subjects, swap outfits, change backgrounds) | `pixverse:modify-video` |
| Animate a character with motion from a reference video | `pixverse:motion-control` |
| Create or edit an image | `pixverse:create-and-edit-image` |
| Extend, upscale, or add audio to a video | `pixverse:post-process-video` |
| Create transition animation between frames | `pixverse:transition` |
| Check generation progress | `pixverse:task-management` |
| Browse, download, upload, or delete assets | `pixverse:asset-management` |
| Organize assets into named folders | `pixverse:saved-folders` |
| Set up auth or check account | `pixverse:auth-and-account` |
| Browse and create from effect templates | `pixverse:template` |
| Manage workspaces (list, switch, status) | `pixverse:workspace` |
| Generate Mondo-style posters and covers | `pixverse:mondo-poster-design` |
| Design and reuse persistent characters across a story | `pixverse:character-design` |
| Design and reuse persistent key items / props / objects | `pixverse:item-design` |

> **Looking up models or parameters?** Don't wait until you're generating — read the relevant capabilities file directly:
> - Video models & constraints → `skills/capabilities/create-video.md` (Model Reference section)
> - Image models & constraints → `skills/capabilities/create-and-edit-image.md` (Model Reference section)

---

## Model Quick Reference

Use this to pick a model before diving into a sub-skill.

### Video Models (`pixverse create video --model <value>`)

| Model | `--model` value | Max Quality | Duration |
|:---|:---|:---|:---|
| PixVerse V6 *(default)* | `v6` | `1080p` | `1`–`15`s |
| PixVerse C1 | `pixverse-c1` | `1080p` | `1`–`15`s |
| PixVerse v5.6 | `v5.6` | `1080p` | `1`–`10`s |
| Sora 2 | `sora-2` | `720p` | `4`, `8`, or `12`s |
| Sora 2 Pro | `sora-2-pro` | `1080p` | `4`, `8`, or `12`s |
| Veo 3.1 Standard | `veo-3.1-standard` | `1080p` | `4`, `6`, or `8`s |
| Veo 3.1 Fast | `veo-3.1-fast` | `1080p` | `4`, `6`, or `8`s |
| Veo 3.1 Lite | `veo-3.1-lite` | `1080p` | `4`–`6`s |
| Grok Imagine | `grok-imagine` | `720p` | `1`–`15`s |
| Happy Horse 1.0 | `happyhorse-1.0` | `1080p` | `3`–`15`s |
| Seedance 2.0 Standard | `seedance-2.0-standard` | `1080p` | `4`–`15`s |
| Seedance 2.0 Fast | `seedance-2.0-fast` | `720p` | `4`–`15`s |
| Kling O3 Pro | `kling-o3-pro` | `720p` | `3`–`15`s |
| Kling O3 Standard | `kling-o3-standard` | `720p` | `3`–`15`s |
| Kling 3.0 Pro | `kling-3.0-pro` | `720p` | `3`–`15`s |
| Kling 3.0 Standard | `kling-3.0-standard` | `720p` | `3`–`15`s |
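
The duration column above can be turned into a pre-flight check so that a script rejects an unsupported duration before spending credits. This is a rough sketch: `pv_duration_ok` is a hypothetical helper, the ranges are copied from this table, and it assumes durations are whole seconds.

```bash
# Hypothetical pre-flight check built from the video model table above.
# Returns 0 if the duration (whole seconds) is listed for the model, 1 otherwise.
pv_duration_ok() {
  local model="$1" secs="$2" min max

  # Reject empty or non-numeric input.
  case "$secs" in ''|*[!0-9]*) return 1 ;; esac

  case "$model" in
    # Models that accept only a fixed set of durations.
    sora-2|sora-2-pro)
      case "$secs" in 4|8|12) return 0 ;; *) return 1 ;; esac ;;
    veo-3.1-standard|veo-3.1-fast)
      case "$secs" in 4|6|8) return 0 ;; *) return 1 ;; esac ;;
    # Models that accept a continuous range.
    v6|pixverse-c1|grok-imagine) min=1 max=15 ;;
    v5.6) min=1 max=10 ;;
    veo-3.1-lite) min=4 max=6 ;;
    seedance-2.0-standard|seedance-2.0-fast) min=4 max=15 ;;
    happyhorse-1.0|kling-o3-pro|kling-o3-standard|kling-3.0-pro|kling-3.0-standard)
      min=3 max=15 ;;
    *) return 1 ;;  # unknown model
  esac

  [ "$secs" -ge "$min" ] && [ "$secs" -le "$max" ]
}
```

Usage: `pv_duration_ok sora-2 8 && echo "ok"`. The table in this file remains the source of truth; if it changes, the `case` arms need to change with it.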

### Image Models (`pixverse create image --model <value>`)

| Model | `--model` value | Max Quality |
|:---|:---|:---|
| Qwen Image *(default)* | `qwen-image` | `1080p` |
| GPT Image 2 | `gpt-image-2.0` | `2160p` |
| Seedream 5.0 Lite | `seedream-5.0-lite` | `2160p` |
| Seedream 4.5 | `seedream-4.5` | `2160p` |
| Seedream 4.0 | `seedream-4.0` | `2160p` |
| Gemini 2.5 Flash (Nano Banana) | `gemini-2.5-flash` | `1080p` |
| Gemini 3.0 (Nano Banana Pro) | `gemini-3.0` | `2160p` |
| Gemini 3.1 Flash (Nano Banana 2) | `gemini-3.1-flash` | `2160p` |
| Kling Image O3 | `kling-image-o3` | `2160p` |
| Kling Image V3 | `kling-image-v3` | `1440p` |

For full parameter constraints (aspect ratios, quality per model, mode support), read the capabilities files listed above.

---

## Workflow Skills

| I want to... | Use skill |
|:---|:---|
| Generate video from text end-to-end | `pixverse:text-to-video-pipeline` |
| Animate an image into video | `pixverse:image-to-video-pipeline` |
| Generate image then animate it | `pixverse:text-to-image-to-video` |
| Iteratively edit an image | `pixverse:image-editing-pipeline` |
| Modify a video and enhance it | `pixverse:modify-video-pipeline` |
| Full video production (create + extend + audio + upscale) | `pixverse:video-production` |
| Animate a character with a motion reference | `pixverse:motion-control-pipeline` |
| Create multiple items in parallel | `pixverse:batch-creation` |
| Generate a Mondo-style poster end-to-end | `pixverse:mondo-poster-pipeline` |
| Generate poster then animate into video | `pixverse:mondo-poster-to-video-pipeline` |
| Storyboard → 4-shot video from a single prompt | `pixverse:storyboard-to-video` |

---

## Reference Materials

Located in `skills/references/`. These are read-only knowledge bases that capabilities and workflows draw from — no CLI commands, just curated design knowledge.

| Reference | Path | Content |
|:---|:---|:---|
| Mondo Artist Styles | `references/mondo-poster/artist-styles.md` | 37 artist styles with prompt keywords across 7 categories |
| Mondo Composition Patterns | `references/mondo-poster/composition-patterns.md` | 8 composition techniques (negative space, silhouette, geometric framing, etc.) |
| Mondo Genre Templates | `references/mondo-poster/genre-templates.md` | Genre-specific prompt templates for film, book covers, and album covers |

---

## All Commands

| Command | Description |
|:---|:---|
| `auth login` | Login via browser (

Source: https://github.com/pixverseai/skills/tree/main/skills
