# ComfyUI Gateway
## Overview
REST API gateway for ComfyUI servers. Workflow management, job queuing, webhooks, caching, auth, rate limiting, and image delivery (URL + base64).
## When to Use This Skill
- When the user mentions "comfyui" or related topics
- When the user mentions "comfy ui" or related topics
- When the user mentions "stable diffusion api gateway" or related topics
- When the user mentions "gateway comfyui" or related topics
- When the user mentions "api gateway imagens" or related topics
- When the user mentions "queue imagens" or related topics
## Do Not Use This Skill When
- The task is unrelated to comfyui gateway
- A simpler, more specific tool can handle the request
- The user needs general-purpose assistance without domain expertise
## How It Works
A production-grade REST API gateway that transforms any ComfyUI server into a universal,
secure, and scalable service. Supports workflow templates with placeholders, job queuing
with priorities, webhook callbacks, result caching, and multiple storage backends.
## Architecture Overview
```
┌─────────────┐     ┌──────────────────────────────────┐     ┌──────────┐
│   Clients   │────▶│         ComfyUI Gateway          │────▶│ ComfyUI  │
│ (curl, n8n, │     │                                  │     │ Server   │
│  Claude,    │     │  ┌─────────┐  ┌──────────────┐   │     │ (local/  │
│  Lovable,   │     │  │ Fastify │  │ BullMQ Queue │   │     │  remote) │
│  Supabase)  │     │  │ API     │──│ (or in-mem)  │   │     └──────────┘
│             │◀────│  └─────────┘  └──────────────┘   │
│             │     │  ┌─────────┐  ┌──────────────┐   │     ┌──────────┐
│             │     │  │ Auth +  │  │ Storage      │   │────▶│ S3/MinIO │
│             │     │  │ RateL.  │  │ (local/S3)   │   │     │(optional)│
│             │     │  └─────────┘  └──────────────┘   │     └──────────┘
└─────────────┘     └──────────────────────────────────┘
```
## Components
| Component | Purpose | File(s) |
|-----------|---------|---------|
| **API Gateway** | REST endpoints, validation, CORS | `src/api/` |
| **Worker** | Processes jobs, talks to ComfyUI | `src/worker/` |
| **ComfyUI Client** | HTTP + WebSocket to ComfyUI | `src/comfyui/` |
| **Workflow Manager** | Template storage, placeholder rendering | `src/workflows/` |
| **Storage Provider** | Local disk + S3-compatible | `src/storage/` |
| **Cache** | Hash-based deduplication | `src/cache/` |
| **Notifier** | Webhook with HMAC signing | `src/notifications/` |
| **Auth** | API key + JWT + rate limiting | `src/auth/` |
| **DB** | SQLite (better-sqlite3) or Postgres | `src/db/` |
| **CLI** | Init, add-workflow, run, worker | `src/cli/` |
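The Cache row describes hash-based deduplication but not how the hash is derived. The following is a minimal sketch of one plausible approach, not the gateway's confirmed implementation: it assumes the key is a SHA-256 digest over the workflow ID plus a key-order-independent serialization of the inputs. The names `canonicalize` and `cacheKey` are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: serialize a value with object keys sorted, so that
// logically equal inputs produce the same string regardless of key order.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  // JSON.stringify returns undefined for undefined input; treat it as null.
  return JSON.stringify(value) ?? "null";
}

// Assumed key derivation: SHA-256 over the workflow ID and canonical inputs.
function cacheKey(workflowId: string, inputs: Record<string, unknown>): string {
  return createHash("sha256")
    .update(workflowId)
    .update("\n")
    .update(canonicalize(inputs))
    .digest("hex");
}
```

With this scheme, two requests that differ only in the order of their input keys map to the same cache entry, while any change to a value produces a different key.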
## Quick Start
```bash
# 1. Install
cd comfyui-gateway
npm install

# 2. Configure
cp .env.example .env

# 3. Initialize
npx tsx src/cli/index.ts init

# 4. Add a workflow
npx tsx src/cli/index.ts add-workflow ./workflows/sdxl_realism_v1.json \
  --id sdxl_realism_v1 --schema ./workflows/sdxl_realism_v1.schema.json

# 5. Start (API + worker in one process)
npm run dev

# Or separately:
npm run start:api     # API only
npm run start:worker  # Worker only
```
## Environment Variables
All configuration is read from `.env`; nothing is hardcoded:
| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `3000` | API server port |
| `HOST` | `0.0.0.0` | API bind address |
| `COMFYUI_URL` | `http://127.0.0.1:8188` | ComfyUI server URL |
| `COMFYUI_TIMEOUT_MS` | `300000` | Max wait for ComfyUI (5min) |
| `API_KEYS` | `""` | Comma-separated API keys (`key:role`) |
| `JWT_SECRET` | `""` | JWT signing secret (empty = JWT disabled) |
| `REDIS_URL` | `""` | Redis URL (empty = in-memory queue) |
| `DATABASE_URL` | `./data/gateway.db` | SQLite path or Postgres URL |
| `STORAGE_PROVIDER` | `local` | `local` or `s3` |
| `STORAGE_LOCAL_PATH` | `./data/outputs` | Local output directory |
| `S3_ENDPOINT` | `""` | S3/MinIO endpoint |
| `S3_BUCKET` | `""` | S3 bucket name |
| `S3_ACCESS_KEY` | `""` | S3 access key |
| `S3_SECRET_KEY` | `""` | S3 secret key |
| `S3_REGION` | `us-east-1` | S3 region |
| `WEBHOOK_SECRET` | `""` | HMAC signing secret for webhooks |
| `WEBHOOK_ALLOWED_DOMAINS` | `*` | Comma-separated allowed callback domains |
| `MAX_CONCURRENCY` | `1` | Parallel jobs per GPU |
| `MAX_IMAGE_SIZE` | `2048` | Maximum dimension (width or height) |
| `MAX_BATCH_SIZE` | `4` | Maximum batch size |
| `CACHE_ENABLED` | `true` | Enable result caching |
| `CACHE_TTL_SECONDS` | `86400` | Cache TTL (24h) |
| `RATE_LIMIT_MAX` | `100` | Requests per window |
| `RATE_LIMIT_WINDOW_MS` | `60000` | Rate limit window (1min) |
| `LOG_LEVEL` | `info` | Pino log level |
| `PRIVACY_MODE` | `false` | Redact prompts from logs |
| `CORS_ORIGINS` | `*` | Allowed CORS origins |
| `NODE_ENV` | `development` | Environment |
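As an illustration of the `API_KEYS` format (`key:role`, comma-separated), here is a minimal parsing sketch. The function names `parseApiKeys` and `isRole` are hypothetical, and throwing on malformed entries rather than skipping them is an assumption, not documented gateway behavior.

```typescript
type Role = "admin" | "user";

// Type guard for the two roles named in this document.
function isRole(value: string): value is Role {
  return value === "admin" || value === "user";
}

// Parse a value such as "key1:admin,key2:user" into a key -> role map.
// Assumption: malformed entries are rejected instead of being ignored.
function parseApiKeys(raw: string): Map<string, Role> {
  const keys = new Map<string, Role>();
  for (const entry of raw.split(",")) {
    const trimmed = entry.trim();
    if (trimmed === "") continue; // empty API_KEYS means no keys configured
    const sep = trimmed.lastIndexOf(":");
    const key = sep === -1 ? "" : trimmed.slice(0, sep);
    const role = trimmed.slice(sep + 1);
    if (key === "" || !isRole(role)) {
      throw new Error('Invalid API_KEYS entry: expected "key:role"');
    }
    keys.set(key, role);
  }
  return keys;
}
```

Calling `parseApiKeys(process.env.API_KEYS ?? "")` at startup would surface a misconfigured key list early instead of at the first request.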
## Health & Capabilities
```
GET /health
→ { ok: true, version, comfyui: { reachable, url, models? }, uptime }
GET /capabilities
→ { workflows: [...], maxSize, maxBatch, formats, storageProvider }
```
## Workflows (CRUD)
```
GET /workflows → list all workflows
POST /workflows → register new workflow
GET /workflows/:id → workflow details + input schema
PUT /workflows/:id → update workflow
DELETE /workflows/:id → remove workflow
```
## Jobs
```
POST /jobs → create job (returns jobId immediately)
GET /jobs/:jobId → status + progress + outputs
GET /jobs/:jobId/logs → sanitized execution logs
POST /jobs/:jobId/cancel → request cancellation
GET /jobs → list jobs (filters: status, workflowId, after, before, limit)
```
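A hedged sketch of how a client might assemble a `POST /jobs` request. The body fields (`workflowId`, `inputs`, `callbackUrl`, `metadata.requestId`) and the `X-API-Key` header are the ones this document names, but the exact body shape is an assumption and `buildCreateJobRequest` is a hypothetical helper. It only builds the request and sends nothing.

```typescript
// Assumed request body shape, pieced together from fields named in this document.
interface CreateJobBody {
  workflowId: string;
  inputs: Record<string, unknown>;
  callbackUrl?: string;
  metadata?: { requestId?: string };
}

// Build the URL and fetch options for POST /jobs without sending anything.
function buildCreateJobRequest(baseUrl: string, apiKey: string, body: CreateJobBody) {
  return {
    url: new URL("/jobs", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-API-Key": apiKey },
      body: JSON.stringify(body),
    },
  };
}

const request = buildCreateJobRequest("http://localhost:3000", "key2", {
  workflowId: "sdxl_realism_v1",
  inputs: { prompt: "a cat" },
  metadata: { requestId: "req-1" },
});
// To send it: const res = await fetch(request.url, request.init);
```

Reusing the same `metadata.requestId` on a retry is what would let the gateway's idempotency check recognize the duplicate.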
## Outputs
```
GET /outputs/:jobId → list output files + metadata
GET /outputs/:jobId/:file → download/stream file
```
## Job Lifecycle
```
queued → running → succeeded
                 → failed
                 → canceled
```
1. Client POSTs to `/jobs` with workflowId + inputs
2. Gateway validates, checks cache, checks idempotency
3. If cache hit → returns existing outputs immediately (status: `cache_hit`)
4. Otherwise → enqueues job, returns `jobId` + `pollUrl`
5. Worker picks up job, renders workflow template, submits to ComfyUI
6. Worker polls ComfyUI for progress (or listens via WebSocket)
7. On completion → downloads outputs, stores them, updates DB
8. If callbackUrl → sends signed webhook POST
9. Client polls `/jobs/:jobId` or receives webhook
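The lifecycle above can be sketched as a small transition table. This models only the arrows drawn in the diagram. Whether a job can be canceled while still queued is not stated, so that transition is deliberately left out. `TRANSITIONS`, `canTransition`, and `isTerminal` are hypothetical names.

```typescript
type JobStatus = "queued" | "running" | "succeeded" | "failed" | "canceled";

// Transitions taken from the lifecycle diagram only. A queued -> canceled
// edge may exist in practice, but it is not shown, so it is not modeled here.
const TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  queued: ["running"],
  running: ["succeeded", "failed", "canceled"],
  succeeded: [],
  failed: [],
  canceled: [],
};

// True when moving from one status to another follows a drawn arrow.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// A status with no outgoing arrows is terminal.
function isTerminal(status: JobStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

A polling client can stop as soon as `isTerminal` returns true for the status reported by `GET /jobs/:jobId`.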
## Workflow Templates
Workflows are ComfyUI JSON with `{{placeholder}}` tokens. The gateway resolves
these at runtime using the job's `inputs` and `params`:
```json
{
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "seed": "{{seed}}",
      "steps": "{{steps}}",
      "cfg": "{{cfg}}",
      "sampler_name": "{{sampler}}",
      "scheduler": "normal",
      "denoise": 1,
      "model": ["4", 0],
      "positive": ["6", 0],
      "negative": ["7", 0],
      "latent_image": ["5", 0]
    }
  },
  "6": {
    "class_type": "CLIPTextEncode",
    "inputs": {
      "text": "{{prompt}}",
      "clip": ["4", 1]
    }
  }
}
```
Each workflow has an `inputSchema` (Zod) that validates what the client sends.
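To make the placeholder step concrete, here is a minimal rendering sketch. It rests on two assumptions that the source does not confirm: a string consisting solely of one `{{name}}` token is replaced by the raw value (so `"{{seed}}"` can become a number), and tokens embedded in longer strings are interpolated as text. `render` and `lookup` are hypothetical names.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

const WHOLE = /^\{\{(\w+)\}\}$/;   // the entire string is a single placeholder
const EMBEDDED = /\{\{(\w+)\}\}/g; // placeholders inside a longer string

// Fetch a value by placeholder name, failing loudly when it is missing.
function lookup(name: string, values: Record<string, Json>): Json {
  if (!Object.prototype.hasOwnProperty.call(values, name)) {
    throw new Error(`Missing value for placeholder "${name}"`);
  }
  return values[name];
}

// Recursively walk the workflow JSON and substitute placeholders.
function render(node: Json, values: Record<string, Json>): Json {
  if (typeof node === "string") {
    const whole = WHOLE.exec(node);
    if (whole) return lookup(whole[1], values); // keeps numbers as numbers
    return node.replace(EMBEDDED, (_m, name: string) => String(lookup(name, values)));
  }
  if (Array.isArray(node)) return node.map((item) => render(item, values));
  if (node !== null && typeof node === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) out[key] = render(value, values);
    return out;
  }
  return node; // numbers, booleans, and null pass through unchanged
}
```

Node links such as `["4", 0]` contain no tokens, so they pass through untouched. Only string leaves are rewritten.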
## Security Model
- **API Keys**: `X-API-Key` header; keys configured via `API_KEYS` env var as `key1:admin,key2:user`
- **JWT**: Optional; when `JWT_SECRET` is set, accepts `Authorization: Bearer <token>`
- **Roles**: `admin` (full CRUD on workflows + jobs), `user` (create jobs, read own jobs)
- **Rate Limiting**: Per key + per IP, configurable window and max
- **Webhook Security**: HMAC-SHA256 signature in `X-Signature` header
- **Callback Allowlist**: Only approved domains receive webhooks
- **Privacy Mode**: When enabled, prompts are redacted from logs and DB
- **Idempotency**: `metadata.requestId` prevents duplicate processing
- **CORS**: Configurable allowed origins
- **Input Validation**: Zod schemas on every endpoint; max size/batch enforced
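For webhook receivers, the `X-Signature` check might look like the sketch below. It assumes the signature is the hex-encoded HMAC-SHA256 of the raw request body, keyed with `WEBHOOK_SECRET`. The encoding and the exact bytes that are signed are not specified in this document, so confirm both before relying on it. `signPayload` and `verifySignature` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the signature over the raw body. Hex encoding is an assumption.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Compare the received signature with the expected one in constant time.
function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verification has to run on the raw body exactly as received. Parsing the JSON and re-serializing it can change whitespace or key order and invalidate the signature.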
## ComfyUI Integration
The gateway communicates with the ComfyUI server over HTTP and WebSocket through the ComfyUI Client in `src/comfyui/`.