ai-image-generation
Generate and edit images on RunComfy via the `runcomfy` CLI — a smart router across the full image-model catalog: FLUX 2 (Klein 9B/4B, Pro, Dev, Flash, Turbo, Max), Google Nano Banana 2 / Pro, OpenAI GPT Image 2, ByteDance Seedream 5 / 4-5 / 4-0 and Dreamina 4-0, Alibaba Qwen Image and Z-Image Turbo, Wan 2-7. Covers both text-to-image (t2i) and image-to-image / edit (i2i) endpoints — the skill picks the right model for the user's actual intent (typography precision, photoreal portraits, sub-second iteration, multi-reference brand styling, open-weights workflow) and ships each model's documented prompting patterns plus the minimal `runcomfy run` invoke. Triggers on "generate image", "make a picture", "text to image", "AI image", "make an image of …", "image to image", "i2i", or any explicit ask to create or restyle an image.
# AI Image Generation
Generate and edit images with 11+ AI models via the [RunComfy](https://www.runcomfy.com/?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) CLI — text-to-image and image-to-image, one auth, one command. This skill picks the right model for the user's intent and ships the documented prompt patterns + the exact `runcomfy run` invoke for each.
[runcomfy.com](https://www.runcomfy.com/?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [Browse all models](https://www.runcomfy.com/models?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [CLI docs](https://docs.runcomfy.com/cli/introduction?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
## Powered by the RunComfy CLI
```bash
# 1. Install (one of — see runcomfy-cli skill for details)
npm i -g @runcomfy/cli # global install
npx -y @runcomfy/cli --version # zero-install
# 2. Sign in (interactive — opens browser)
runcomfy login
# or in CI / containers:
export RUNCOMFY_TOKEN=<token-from-runcomfy.com/profile>
# 3. Generate
runcomfy run <vendor>/<model>/<endpoint> \
--input '{"prompt": "..."}' \
--output-dir ./out
```
CLI docs: [Install](https://docs.runcomfy.com/cli/install?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [Quickstart](https://docs.runcomfy.com/cli/quickstart?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [Commands](https://docs.runcomfy.com/cli/commands?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [Auth](https://docs.runcomfy.com/cli/auth?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation) · [Troubleshooting](https://docs.runcomfy.com/cli/troubleshooting?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
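Prompts that contain double quotes, backslashes, or line breaks can break the inline `--input` JSON shown in step 3. The sketch below is one way to build that JSON safely before handing it to `runcomfy run`. `json_escape` and `build_input` are hypothetical helper names, not part of the RunComfy CLI; the only field assumed is `prompt`, as in the snippet above. Tabs and other control characters are not handled.

```bash
# Hypothetical helpers (not part of the runcomfy CLI).

# Escape backslashes and double quotes, and turn line breaks into \n,
# so the text can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Print a JSON object with a single "prompt" field.
build_input() {
  printf '{"prompt": "%s"}' "$(json_escape "$1")"
}

# Usage (not executed here):
#   runcomfy run openai/gpt-image-2/text-to-image \
#     --input "$(build_input 'Poster with the headline "OPEN LATE"')" \
#     --output-dir ./out
```

Passing the JSON through a double-quoted command substitution also avoids the problem of a single quote inside the prompt ending the shell string early.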
## Install this skill
```bash
npx skills add agentspace-so/runcomfy-agent-skills --skill ai-image-generation -g
```
---
## Pick the right model for the user's intent
### Text-to-image (t2i) — newest first
**FLUX 2 Klein 9B** — `blackforestlabs/flux-2-klein/9b/text-to-image` *(default)*
> Step-distilled, 4–25 steps, native multi-reference conditioning, strong photoreal + illustration all-rounder.
> Pick for: intent unclear, fast iteration, multi-ref styling, general-purpose.
> Avoid for: in-image text — use **GPT Image 2**.
**FLUX 2 Klein 4B** — `blackforestlabs/flux-2-klein/4b/text-to-image`
> Sub-second variant of Klein 9B, same field set.
> Pick for: storyboard, moodboard, batch concepting at speed.
> Avoid for: final delivery — slight quality drop vs 9B.
**FLUX 2 Pro / Dev / Flash / Turbo / Max** — `blackforestlabs/flux-2/max`, [`flux-2-dev`](https://www.runcomfy.com/models/blackforestlabs/flux-2-dev/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation), [`flux-2-flash`](https://www.runcomfy.com/models/blackforestlabs/flux-2-flash?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation), [`flux-2-turbo`](https://www.runcomfy.com/models/blackforestlabs/flux-2-turbo?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Higher-fidelity tiers of the FLUX 2 base. Cinematic + brand work, hero shots.
> Pick for: production polish, brand campaigns.
> Avoid for: sub-second speed — use **Klein 4B**.
**Nano Banana Pro** — [`google/nano-banana-pro/text-to-image`](https://www.runcomfy.com/models/google/nano-banana-pro/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Highest-quality Nano Banana tier. Gemini-grounded, optional web search for real-world references (products, landmarks).
> Pick for: NB-style instruction-following at higher fidelity.
> Avoid for: cost-sensitive iteration — drop to **Nano Banana 2**.
**Nano Banana 2** — `google/nano-banana-2/text-to-image`
> Flash-tier latency, predictable framing, `enable_web_search` flag for real-product / real-person grounding.
> Pick for: speed iteration, 4-up batch, real-world grounded prompts.
> Avoid for: long compositional instructions — use **GPT Image 2**.
**GPT Image 2** — `openai/gpt-image-2/text-to-image`
> Best-in-class in-image text rendering (Japanese kana, Cyrillic, Arabic). Layout-precise instruction following.
> Pick for: posters, ads, multi-line copy, multilingual creatives, exact-text headlines.
> Avoid for: photoreal portraits — **Seedream 5** wins on skin tones and lighting.
**Seedream 5 Lite** — [`bytedance/seedream-5/lite/text-to-image`](https://www.runcomfy.com/models/bytedance/seedream-5/lite/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Latest ByteDance Seedream tier. Photoreal skin tones, natural lighting, strong East Asian aesthetic.
> Pick for: photoreal portraits, product shots, fashion / lifestyle.
> Avoid for: typography precision — use **GPT Image 2**.
**Seedream 4-5** — [`bytedance/seedream-4-5/text-to-image`](https://www.runcomfy.com/models/bytedance/seedream-4-5/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Previous Seedream flagship, still strong on photoreal.
> Pick for: identity-stable batches between Seedream-5 generations; cheaper Seedream tier.
> Avoid for: new work — prefer **Seedream 5 Lite**.
**Dreamina 4-0** — [`bytedance/dreamina-4-0/text-to-image`](https://www.runcomfy.com/models/bytedance/dreamina-4-0/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> ByteDance illustration / concept-art lean, stylized characters.
> Pick for: concept art, illustrated heroes, painterly assets.
> Avoid for: photoreal — use **Seedream**.
**Qwen Image 2512** — [`qwen/qwen-image/qwen-image-2512`](https://www.runcomfy.com/models/qwen/qwen-image/qwen-image-2512?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Alibaba Qwen latest, open-weights, LoRA-compatible (`/lora` variant).
> Pick for: open-weights workflow, Qwen-aligned LoRA chains.
> Avoid for: closed-weights polish — use **FLUX 2** or **GPT Image 2**.
**Wan 2-7** — [`wan-ai/wan-2-7/text-to-image`](https://www.runcomfy.com/models/wan-ai/wan-2-7/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation), [`wan-ai/wan-2-7/pro/text-to-image`](https://www.runcomfy.com/models/wan-ai/wan-2-7/pro/text-to-image?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Open-weights, pairs natively with Wan 2-7 video models for unified-stack workflows.
> Pick for: Wan-stack pipelines (image + video same brand), open-weights requirement.
> Avoid for: top-tier image-only quality.
**Z-Image Turbo** — [`tongyi-mai/z-image/turbo`](https://www.runcomfy.com/models/tongyi-mai/z-image/turbo?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Sub-second open-weights, native LoRA `/lora` variant.
> Pick for: LoRA-customized open-weights workflow at speed.
> Avoid for: closed-weights polish.
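The text-to-image picks above can be read as a lookup from intent to model ID. The following is a minimal sketch of that routing as a shell function. The intent keywords (`typography`, `portrait`, and so on) and the function name `pick_t2i_model` are hypothetical labels chosen for illustration; the model IDs are the ones listed in this section, and anything unrecognized falls back to the stated default, FLUX 2 Klein 9B.

```bash
# Hypothetical helper: map a coarse intent keyword to a t2i model ID
# taken from the list in this section.
pick_t2i_model() {
  case "$1" in
    typography|poster|multilingual) echo "openai/gpt-image-2/text-to-image" ;;
    portrait|product|fashion)       echo "bytedance/seedream-5/lite/text-to-image" ;;
    storyboard|moodboard|batch)     echo "blackforestlabs/flux-2-klein/4b/text-to-image" ;;
    concept-art|painterly)          echo "bytedance/dreamina-4-0/text-to-image" ;;
    grounded)                       echo "google/nano-banana-2/text-to-image" ;;
    wan-stack)                      echo "wan-ai/wan-2-7/text-to-image" ;;
    lora-fast)                      echo "tongyi-mai/z-image/turbo" ;;
    open-weights)                   echo "qwen/qwen-image/qwen-image-2512" ;;
    # Intent unclear: use the documented default.
    *)                              echo "blackforestlabs/flux-2-klein/9b/text-to-image" ;;
  esac
}

# Usage (not executed here):
#   runcomfy run "$(pick_t2i_model typography)" \
#     --input '{"prompt": "..."}' \
#     --output-dir ./out
```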
### Image-to-image / edit (i2i) — newest first
**Nano Banana Pro Edit** — [`google/nano-banana-pro/edit`](https://www.runcomfy.com/models/google/nano-banana-pro/edit?utm_source=skills.sh&utm_medium=skill&utm_campaign=ai-image-generation)
> Highest-quality Nano Banana edit tier. Identity-preserving, multi-ref.
> Pick for: premium NB edit work, identity-locked variants.
> Avoid for: cost-sensitive iteration — drop to **Nano Banana 2 Edit**.
**Nano Banana 2 Edit** — `google/nano-banana-2/edit` *(default i2i)*
> 1–20 input images per call, identity-preserving by default, spatial-language honored ("upper-right", "the left object").
> Pick for: default i2i, batch identity-preserving, background swap, directional object remove/add.
> Avoid for: precise mask region — use the [`image-edit`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/image-edit) skill (Z-Image Inpaint).
**GPT Image 2 Edit** — `openai/gpt-image-2/edit`
> Up to 10 reference images, multilingual in-image text rewrite.
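The edit endpoints above state different limits on input images: 1 to 20 for Nano Banana 2 Edit and up to 10 for GPT Image 2 Edit. A small pre-flight check, sketched below, can reject a batch before a call is made. `check_ref_count` is a hypothetical helper, and the lower bound of 1 for GPT Image 2 Edit is an assumption, since only the upper bound is stated. No input field names are assumed here; the function only counts.

```bash
# Hypothetical helper: succeed only when the number of input images fits
# the range stated for the given edit endpoint.
check_ref_count() {
  _model=$1
  _count=$2
  case "$_model" in
    google/nano-banana-2/edit) _min=1; _max=20 ;;
    # Upper bound is documented; the lower bound of 1 is an assumption.
    openai/gpt-image-2/edit)   _min=1; _max=10 ;;
    *)
      echo "no documented image limit for $_model" >&2
      return 2
      ;;
  esac
  [ "$_count" -ge "$_min" ] && [ "$_count" -le "$_max" ]
}

# Usage (not executed here):
#   check_ref_count google/nano-banana-2/edit "$#" || echo "too many or too few images" >&2
```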