Claude
Skills
Sign in
Back

brewui:glm-design-to-code

Included with Lifetime
$97 forever

Generates frontend code from design screenshots, mockups, text descriptions, HTML files, or URLs using external GLM vision API. Modes: CREATE, REVIEW, FIX. Output: HTML, React, Flutter. Triggers: design to code, screenshot to code, mockup to code, d2c, generate frontend from image.

Designscripts

What this skill does


<instructions>

# GLM Design-to-Code

Converts design inputs (screenshots, text descriptions, HTML files, URLs) to working frontend code using GLM vision models. Three modes: CREATE, REVIEW, FIX.

**Arguments:** `$ARGUMENTS`

## Mode Routing

| Mode | Flow |
|------|------|
| CREATE | Phase 0 → 0.5 → 1 → 2 → 3 (with auto-fix) → 4 (mandatory verify) → 5 (if --review) |
| REVIEW | Phase 0 → 0.5 → 1 → 5 |
| FIX | Phase 0 → 0.5 → 1 → 6 |

> **PARAMETER PRIORITY:** User prompt arguments ALWAYS override defaults and environment.
> 1. Explicit flags in `$ARGUMENTS` (`--model`, `--profile`, `--provider`, `--framework`) → highest priority
> 2. API keys from `$ARGUMENTS` prompt text (if user pasted a key inline) → override `.env`
> 3. Environment variables (`.env`, shell env) → fallback
> 4. Defaults from `parse-args.sh` → lowest priority
>
> **MANDATORY OUTPUT:** Before ANY API call, output the full resolved configuration table (see Step 2.5 in Phase 2). This applies to ALL modes (CREATE, REVIEW, FIX). The user must always see what was resolved from their input.

---

## Phase 0: Parse Arguments and Gather Config

### Step 1: Parse Flags

**EXECUTE** using Bash tool:
```bash
bash "${CLAUDE_SKILL_DIR}/scripts/parse-args.sh" "$ARGUMENTS" && echo "OK" || echo "FAILED"
```

Output: key=value pairs. Store all values.

> **Also scan `$ARGUMENTS` raw text** for any inline values not captured by flags:
> - API key pasted in prompt text → extract and use (overrides `.env`)
> - Model name mentioned in free text (e.g., "use glm-4.6v") → treat as `--model`
> - Profile/provider mentioned in text → treat as flags

| Key | Default | Options |
|-----|---------|---------|
| `IMAGE` | (required) | Path to screenshot file, URL, HTML file, or text description |
| `INPUT_TYPE` | auto | image, html, text, url |
| `FRAMEWORK` | html | html, react, flutter, custom |
| `PROFILE` | max | max, optimal, efficient |
| `PROVIDER` | zai | zai, openrouter |
| `OUTPUT` | `./d2c-output` | Output directory |
| `REVIEW` | false | true/false |
| `MODE` | create | create, review, fix |
| `FIX_TEXT` | (empty) | Text from --fix "..." |
| `REVIEW_FILE` | (empty) | Path from --review-file |
| `MODEL` | (empty) | Model override from --model |
| `MAX_TOKENS` | 32768 | 32768 (max), 16384 (optimal), 8192 (efficient) |

> **STOP if FAILED** -- check parse-args.sh.

### Step 1.5: Detect Mode

| Condition | Mode |
|-----------|------|
| `--fix` flag present | FIX |
| `--review` flag present | REVIEW |
| Otherwise | CREATE |

### Step 1.7: Classify Intent (MODE=create only)

> Skip this step for REVIEW and FIX modes.

Analyze the user's prompt text (`$ARGUMENTS`) and the input type to classify intent. **Opus classifies automatically -- no AskUserQuestion needed** (exception: INPUT_TYPE=html with ambiguous signal -- ask).

| Intent | Signals | Default GLM instruction |
|--------|---------|------------------------|
| `reproduce` | Polished mockup, "exact", "copy", "pixel-perfect", no modification language | "Reproduce this design as working code. Match every visual detail exactly." |
| `creative` | "sketch", "wireframe", "rough", "make it look professional", "polish" | "This is a rough sketch. Create a polished, professional UI based on this layout. Use modern design, clean typography, harmonious colors." |
| `enhance` | "add a", "include", "put a ... on", existing design + additions | "This is an existing design. Enhance it: {user request}. Keep all existing content intact." |
| `modify` | "change", "update", "make darker", color/font/layout changes | "Modify this design: {user changes}. Keep everything else unchanged." |
| `convert` | "to React", "to Flutter", "convert", INPUT_TYPE=html + different framework | "Convert this {source} to {FRAMEWORK}. Preserve visual appearance." |

**Default:** `reproduce` (matches current behavior when no specific signals detected).

Store as variables for later use:
- `INTENT` -- one of: reproduce, creative, enhance, modify, convert
- `GLM_INSTRUCTION` -- the instruction text (from table above, with placeholders filled from user prompt)

> **Exception:** If INPUT_TYPE=html and no clear intent signal in prompt -- **ASK** using AskUserQuestion:
> ```
> HTML input detected. What would you like to do?
> ```
> Options:
> - "Convert to {FRAMEWORK} (preserve appearance)"
> - "Reproduce as clean HTML/CSS from scratch"
> - "Use as reference -- create improved version"

### Step 2: Process Input by Type

Based on `INPUT_TYPE` from parse-args.sh:

#### If INPUT_TYPE=image
**EXECUTE** using Bash tool:
```bash
IMAGE="IMAGE_PATH_HERE"
[ -f "$IMAGE" ] && file --mime-type "$IMAGE" | grep -qE ': image/' && echo "VALID_IMAGE" || echo "INVALID"
```
> **If INVALID:**

**ASK** using AskUserQuestion:
```
The file "{IMAGE}" is not a valid image. Please provide a valid input:
```
Options:
- "Enter path to screenshot file (PNG/JPG/WebP)"
- "Enter a URL to screenshot"
- "Enter a text description instead"

**On answer:**
- File path -> re-validate as image, update IMAGE and INPUT_TYPE=image
- URL -> update IMAGE, set INPUT_TYPE=url, go to URL processing above
- Text description -> update IMAGE with text, set INPUT_TYPE=text

#### If INPUT_TYPE=url
Take a Playwright screenshot of the URL first:
**EXECUTE** using Bash tool:
```bash
URL="URL_HERE"
npx playwright screenshot --full-page "$URL" /tmp/d2c-url-screenshot.png 2>&1 && echo "SCREENSHOT_OK" || echo "SCREENSHOT_FAILED"
```
> **If SCREENSHOT_OK:** Set IMAGE=/tmp/d2c-url-screenshot.png and continue as image input.
> **If SCREENSHOT_FAILED:** Try using Playwright MCP `browser_navigate` + `browser_take_screenshot`.
> If Playwright MCP also fails:

**ASK** using AskUserQuestion:
```
Could not take screenshot of "{URL}". The URL may be unreachable or Playwright is not available. Choose alternative:
```
Options:
- "I'll provide a screenshot file instead"
- "Convert from text description"
- "Skip -- I'll paste the HTML source"

**On answer:**
- Screenshot file -> ask for path, set INPUT_TYPE=image
- Text description -> ask for description, set INPUT_TYPE=text
- HTML source -> ask for file path, set INPUT_TYPE=html

#### If INPUT_TYPE=html
**EXECUTE** using Bash tool:
```bash
HTML_FILE="HTML_PATH_HERE"
[ -f "$HTML_FILE" ] && echo "HTML_VALID ($(wc -l < "$HTML_FILE" | tr -d ' ') lines)" || echo "HTML_MISSING"
```
> **If HTML_VALID:** Attempt to screenshot the HTML for dual input (image + HTML source):

**EXECUTE** using Bash tool:
```bash
HTML_FILE="HTML_PATH_HERE"
npx playwright screenshot --full-page "file://$(cd "$(dirname "$HTML_FILE")" && pwd)/$(basename "$HTML_FILE")" /tmp/d2c-html-screenshot.png 2>&1 && echo "SCREENSHOT_OK" || echo "SCREENSHOT_FAILED"
```

> **If SCREENSHOT_OK:** Set `HTML_SCREENSHOT=/tmp/d2c-html-screenshot.png`, `DUAL_INPUT=true`. Will use `glm-build-request.sh` with both screenshot and HTML source.

> **If SCREENSHOT_FAILED:** Try fallback:
```bash
command -v wkhtmltoimage >/dev/null 2>&1 && wkhtmltoimage --quality 90 --width 1440 "$HTML_FILE" /tmp/d2c-html-screenshot.png 2>&1 && echo "SCREENSHOT_OK" || echo "SCREENSHOT_FAILED"
```

> **If still FAILED:** Try Playwright MCP `browser_navigate` to `file://` URL + `browser_take_screenshot`.

> **If all fail:**

**ASK** using AskUserQuestion:
```
Could not screenshot the HTML file. Choose how to proceed:
```
Options:
- "I'll provide a screenshot file" (ask for path, set DUAL_INPUT=true)
- "Continue without screenshot (text-only)" (set DUAL_INPUT=false)

> When `DUAL_INPUT=false`: Will use `glm-build-text-request.sh` (text-only with HTML content).

#### If INPUT_TYPE=text
The description text is in the IMAGE field. No validation needed -- will use glm-build-text-request.sh in Phase 2.

### Step 3: Confirm Settings (if no flags provided)

If IMAGE was the only argument (no flags), **ASK** using AskUserQuestion:

```
Design-to-Code Configuration:

Input: {IMAGE} ({INPUT_TYPE})
Framework: html (HTML/CSS), react (React 18 + CSS Modules), flutter (Flutter Web), custom
Profile: max (pixel-

Related in Design