kling-3-prompting

Write better prompts for Kling 3.0 AI video generation. Use when the user wants to create, write, improve, or refine prompts — text-to-video, image-to-video, keyframes, multi-shot sequences, or dialogue scenes.

What this skill does


## Overview

Kling 3.0 is a unified multimodal video model. It understands **cinematic direction**, not keyword lists. Write prompts like a director — describe what the audience sees, hears, and feels over time.

**Core shift:** Description → Direction. Think "direct a scene" not "describe an image."

## Interactive Builder Workflow

When invoked, guide the user through these steps using `AskUserQuestion`:

```dot
digraph builder {
  "1. Generation mode?" [shape=diamond];
  "Text-to-Video" [shape=box];
  "Image-to-Video" [shape=box];
  "Multi-Shot Sequence" [shape=box];
  "Keyframe Transition" [shape=box];
  "2. Gather scene details" [shape=box];
  "3. Assemble prompt" [shape=box];
  "4. Present & refine" [shape=box];

  "1. Generation mode?" -> "Text-to-Video";
  "1. Generation mode?" -> "Image-to-Video";
  "1. Generation mode?" -> "Multi-Shot Sequence";
  "1. Generation mode?" -> "Keyframe Transition";
  "Text-to-Video" -> "2. Gather scene details";
  "Image-to-Video" -> "2. Gather scene details";
  "Multi-Shot Sequence" -> "2. Gather scene details";
  "Keyframe Transition" -> "2. Gather scene details";
  "2. Gather scene details" -> "3. Assemble prompt";
  "3. Assemble prompt" -> "4. Present & refine";
}
```

### Step 1: Determine Generation Mode

Ask the user which mode:
- **Text-to-Video** — prompt from scratch
- **Image-to-Video** — animate a reference image
- **Multi-Shot Sequence** — 2-6 shot storyboard (up to 15s)
- **Keyframe Transition** — start frame → end frame with interpolated motion

### Step 2: Gather Scene Details

Ask about each element (adapt questions to mode):

| Element | Question | Why it matters |
|---------|----------|----------------|
| **Subject** | Who/what is the focus? Specific appearance details? | Anchors consistency — define distinguishing traits early |
| **Action** | What happens? Describe the timeline (first → then → finally) | Kling 3.0 handles sequential action well across clips of up to 15s |
| **Environment** | Where? Be specific (not "a street" but "narrow Tokyo alley, steam from grates") | Grounds the scene physically |
| **Camera** | Shot type and movement? (See camera reference below) | Cinematic language produces far better results |
| **Lighting** | What light sources? Name them specifically | "Flickering neon" beats "dramatic lighting" |
| **Mood/Emotion** | What should the audience feel? | Drives color grade, pacing, music |
| **Audio** | Dialogue? Ambient sound? Music? | Kling 3.0 generates native audio + lip-sync |
| **Duration** | How long? (3-15s) | Longer clips need the progression described over time |
| **Aspect Ratio** | 16:9 / 9:16 / 1:1 / 21:9? | 16:9 cinematic, 9:16 social, 21:9 ultra-wide |
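
The elements in the table can be collected into a small record and checked before a prompt is assembled. The following is a minimal sketch, not part of the skill itself: the `SceneDetails` class, its field names, and the choice of which fields are required are all assumptions made for illustration. Only the 3-15s duration range and the four aspect ratios come from the table.

```python
from dataclasses import dataclass

# Values taken from the table above; all names here are illustrative.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "21:9"}
MIN_DURATION_S, MAX_DURATION_S = 3, 15

# Assumption: these five elements are treated as required, mood and audio as optional.
REQUIRED_FIELDS = ("subject", "action", "environment", "camera", "lighting")


@dataclass
class SceneDetails:
    subject: str
    action: str
    environment: str
    camera: str
    lighting: str
    mood: str
    audio: str
    duration_s: int
    aspect_ratio: str

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means the brief is usable."""
        issues = []
        for name in REQUIRED_FIELDS:
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if not MIN_DURATION_S <= self.duration_s <= MAX_DURATION_S:
            issues.append(f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S}s")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            issues.append(f"unsupported aspect ratio: {self.aspect_ratio}")
        return issues
```

Any issue returned by `problems()` is a natural follow-up question to ask the user before moving on.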

**Image-to-Video:** Focus on how the scene *evolves from* the image — movement, camera motion, environmental change. The model preserves identity/layout from the source.

**Keyframes:** Ask for start and end frame descriptions. Frames should match in color, style, and lighting. Prompt sparingly — Kling infers motion well.

**Multi-Shot:** Define each shot separately with its own framing, subject, action, and duration. Label shots explicitly.

### Step 3: Assemble the Prompt

Use the **Master Formula**:

```
[Scene/Environment] + [Subject & Appearance] + [Action Timeline] + [Camera Movement] + [Audio & Atmosphere] + [Technical Specs]
```
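
As a rough illustration of the formula, the sketch below joins the six slots in order and skips any that are empty. The function name `assemble_prompt` and the slot keys are hypothetical; the skill does not prescribe an implementation, and the result is a draft to tighten by hand rather than a finished prompt.

```python
# Slot order mirrors the Master Formula; the key names are illustrative.
FORMULA_ORDER = ["scene", "subject", "action", "camera", "audio", "specs"]


def assemble_prompt(parts: dict[str, str]) -> str:
    """Join the non-empty slots in formula order into one prompt string."""
    sentences = []
    for key in FORMULA_ORDER:
        text = parts.get(key, "").strip()
        if text:
            # End each slot with a period so the slots read as sentences.
            sentences.append(text if text.endswith(".") else text + ".")
    return " ".join(sentences)


prompt = assemble_prompt({
    "scene": "Narrow Tokyo alley, steam from grates, glowing vending machines",
    "subject": "Woman in red dress, heels clicking wet cobblestone",
    "action": "She walks toward camera, then stops and glances over her shoulder",
    "camera": "Handheld shoulder-cam drifts behind subject with subtle sway",
    "audio": "Rain ambience, distant traffic",
    "specs": "10s, 16:9",
})
print(prompt)
```

Keeping the order in one list makes it easy to drop a slot for a given mode, for example leaving `scene` empty in Image-to-Video, where the source image already supplies the environment.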

**Writing rules:**
- Use cinematic motion verbs: dolly push, whip-pan, crash zoom, rack focus, tracking shot — NOT "moves" or "goes"
- Name real light sources: neon signs, candlelight, golden hour, LED panels — NOT "dramatic lighting"
- Include texture for credibility: grain, lens flares, condensation, fabric sheen, smoke, sweat
- Describe temporal flow: beginning → middle → end
- Keep to 1-3 rich sentences per shot (specificity > length)
- For dialogue: use character labels, assign voice tone/emotion, use transitional words ("Immediately," "Pause")

### Step 4: Present & Refine

Present the assembled prompt. Ask if they want to:
- Adjust any element
- Add a negative prompt
- Generate variations (different duration, different camera, different mood)

## Quick Reference

### Camera Movements

| Movement | Effect | Example phrase |
|----------|--------|---------------|
| Dolly push-in | Builds intimacy/tension | "slow dolly push-in toward her face" |
| Dolly zoom | Vertigo/dramatic reveal | "dolly zoom creating disorienting depth shift" |
| Tracking shot | Follows subject laterally | "camera tracks alongside as she walks" |
| Whip-pan | Energy/surprise | "whip-pan to reveal the door" |
| Crash zoom | Shock/emphasis | "sudden crash zoom on the object" |
| Rack focus | Shift attention | "rack focus from foreground hand to background figure" |
| Handheld/shoulder-cam | Raw/documentary feel | "handheld shoulder-cam with subtle sway" |
| Static tripod | Composed/observational | "locked-off static tripod, wide shot" |
| FPV drone | High-energy immersion | "dynamic FPV drone shot chasing through corridor" |
| Low-angle tracking | Heroic/imposing | "low-angle tracking shot, subject towers above" |
| Truck left/right | Lateral reveal | "camera trucks right revealing the cityscape" |
| Tilt up/down | Vertical reveal | "slow tilt up from boots to face" |

### Lens & Film Stock

| Phrase | Effect |
|--------|--------|
| "Shot on 35mm film" | Warm grain, organic texture |
| "Macro 85mm lens" | Tight detail, shallow depth of field |
| "Wide-angle steadicam" | Smooth, immersive, spatial |
| "Handheld camcorder" | Raw VHS energy, nostalgic |
| "Anamorphic lens flare" | Cinematic horizontal streaks |

### Lighting

Use specific sources, not adjectives:
- "Golden hour sun cutting through dusty warehouse windows"
- "Flickering neon casting magenta/cyan across wet pavement"
- "Single bare bulb swinging, casting moving shadows"
- "Cool blue LED panels reflecting off glass surfaces"
- "Candlelight warming skin tones, deep shadows beyond"

### Color & Grade

- "Desaturated teal grade, crushed blacks"
- "Amber nightclub strobe cutting through smoke"
- "Cool blue haze filling the corridor"
- "Magenta neon reflecting off wet asphalt"
- "Overexposed highlights, blown-out whites"

## Multi-Character Dialogue

| Rule | Do | Don't |
|------|-----|-------|
| Name characters | `[Character A: Silver-haired CEO]` | `[Man] says...` |
| Anchor to action | *Agent slams table.* **[Agent, angrily]:** "Where is it?" | Just dialogue without visual action |
| Assign voice tone | `[CEO, deep authoritative gravelly voice]` | Generic "says" |
| Control timing | "Immediately," "Pause," "After a beat" | Back-to-back dialogue without transitions |
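
The four rules can be combined into one formatted beat. This is a sketch under assumptions: the helper name `dialogue_line`, its parameters, and the exact bracket layout are illustrative, modeled on the examples in the table rather than on any documented Kling syntax.

```python
def dialogue_line(character: str, tone: str, line: str,
                  action: str = "", timing: str = "") -> str:
    """Format one dialogue beat: optional timing cue, optional action, labeled speech."""
    parts = []
    if timing:
        # Transitional word such as "Immediately" or "After a beat".
        parts.append(f"{timing},")
    if action:
        # Visual action that anchors the line.
        parts.append(action if action.endswith(".") else action + ".")
    # Character label plus voice tone, then the spoken text in quotes.
    parts.append(f'[{character}, {tone}]: "{line}"')
    return " ".join(parts)


print(dialogue_line("Agent", "angrily", "Where is it?", action="Agent slams table"))
```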

## Multi-Shot Structure

```
Shot 1 (0-5s): [Wide establishing shot description]
Shot 2 (5-10s): [Medium/close-up with action progression]
Shot 3 (10-15s): [Resolution/reaction with camera payoff]

Atmosphere: [Overall mood, color grade]
Audio: [Sound design, music, dialogue]
```

Label every shot. Assign durations. Describe framing + subject + motion per shot.
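
One way to keep shot labels and time ranges consistent is to compute them from per-shot durations. The sketch below assumes the limits stated earlier in this skill (2-6 shots, up to 15s in total); the function name `layout_shots` and its input shape are hypothetical.

```python
MAX_TOTAL_S = 15            # "up to 15s" for a multi-shot sequence
MIN_SHOTS, MAX_SHOTS = 2, 6


def layout_shots(shots: list[tuple[str, int]], atmosphere: str, audio: str) -> str:
    """Build the multi-shot block from (description, duration_in_seconds) pairs."""
    if not MIN_SHOTS <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected {MIN_SHOTS}-{MAX_SHOTS} shots, got {len(shots)}")
    total = sum(duration for _, duration in shots)
    if total > MAX_TOTAL_S:
        raise ValueError(f"total duration {total}s exceeds {MAX_TOTAL_S}s")

    lines, start = [], 0
    for number, (description, duration) in enumerate(shots, start=1):
        end = start + duration
        lines.append(f"Shot {number} ({start}-{end}s): {description}")
        start = end  # the next shot begins where this one ends
    lines += ["", f"Atmosphere: {atmosphere}", f"Audio: {audio}"]
    return "\n".join(lines)
```

Deriving the ranges from durations avoids gaps or overlaps between shots when one shot is lengthened during refinement.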

## Start & End Frame Tips

- Frames should match in color palette, style, and lighting
- Identical start/end frames = seamless loop
- Prompt sparingly — Kling infers motion between frames well
- Simple camera directions: zoom in/out, pan left/right, tilt up/down
- Use 5s for dynamic transitions and 10s for complex transformations
- The start frame's aspect ratio sets the aspect ratio of the whole clip
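
Two of these tips reduce to simple decisions, sketched below. The helper `keyframe_plan` is hypothetical, and comparing frame identifiers (for example file paths) is only a stand-in for checking that two frames are actually identical.

```python
def keyframe_plan(start_frame: str, end_frame: str,
                  complex_transformation: bool = False) -> dict:
    """Summarize a keyframe job: loop detection and a suggested duration."""
    return {
        # Assumption: equal identifiers mean the same image is used twice.
        "seamless_loop": start_frame == end_frame,
        # 5s for dynamic transitions, 10s for complex transformations.
        "duration_s": 10 if complex_transformation else 5,
    }
```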

## Negative Prompts

Use to prevent common AI defaults:

```
smiling, laughing, cartoonish, bright saturated colors, low resolution,
morphing, blurry text, disfigured hands, extra fingers, static pose,
frozen expression, stock photo aesthetic
```

Customize based on scene — remove items that conflict with your intent.
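
Customizing can be as simple as filtering the default list. In this sketch the list is copied from the block above, while the function name `negative_prompt` and its `wanted` parameter are illustrative assumptions.

```python
from typing import Iterable

DEFAULT_NEGATIVES = [
    "smiling", "laughing", "cartoonish", "bright saturated colors", "low resolution",
    "morphing", "blurry text", "disfigured hands", "extra fingers", "static pose",
    "frozen expression", "stock photo aesthetic",
]


def negative_prompt(wanted: Iterable[str] = ()) -> str:
    """Return the default negatives minus any terms the scene actually calls for."""
    wanted_lower = {term.lower() for term in wanted}
    return ", ".join(t for t in DEFAULT_NEGATIVES if t not in wanted_lower)


# A comedy scene wants smiling and laughing, so both are dropped from the negatives.
print(negative_prompt({"smiling", "laughing"}))
```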

## Weak → Strong

| Element | Weak | Strong |
|---------|------|--------|
| Camera | "Camera follows person" | "Handheld shoulder-cam drifts behind subject with subtle sway" |
| Subject | "A woman walking" | "Woman in red dress, heels clicking wet cobblestone" |
| Environment | "In a city" | "Narrow Tokyo alley, steam from grates, glowing vending machines" |

Files: 1

Size: 9.2 KB

Complexity: 18/100

Category: Image & Video

Source: https://github.com/aedev-tools/kling-3-prompting-skill/tree/main/skills/kling-3-prompting
