image-annotations

Included with Lifetime

$97 forever

Annotate screenshots, diagrams, and images with callout rectangles, arrows, labels, and color-coded highlights using PIL. Includes rules for animated GIF annotations with timing and pacing.

Image & Video

What this skill does


# Image Annotations

Add visual callouts to any image — screenshots, diagrams, architecture docs, demo frames — using PIL/Pillow. Highlights what changed or what to look at, so reviewers don't have to guess.

## When to Use This Skill

Use this skill when you need to:

- Highlight a specific area in a screenshot for a PR description
- Annotate before/after images to show what changed
- Add labels and callouts to diagrams or architecture images
- Create annotated frames for animated GIF demos

## Prerequisites

```bash
pip install Pillow -q
```

## Color Rules

- **Red (`#E63946`)** — only for "bad" / "removed" things (e.g., circling a bug being fixed)
- **Yellowish-orange (`#FF9F1C`)** — for neutral highlights ("look here", "new feature", etc.)
- Never use red just because it's eye-catching — red = bad/removed

## Font

- Use **Ink Free** (`C:/Windows/Fonts/Inkfree.ttf`) for a handwritten look on Windows
- On Linux/macOS, fall back to `ImageFont.load_default()`
- Size **36** for annotations on ~1400px-wide images
- `stroke_width=1` with `stroke_fill=<same color as fill>` — gives body without being too thick
- Do NOT use white stroke — looks like a bad glow effect

## Shapes

- Prefer **rounded rectangles** over circles/ellipses — less pixelation at edges
- `draw.rounded_rectangle([x1, y1, x2, y2], radius=14, outline=color, width=5)`
- **Padding 18px** around the target content

## Reference Snippet

```python
from PIL import Image, ImageDraw, ImageFont

# Setup
font = ImageFont.truetype('C:/Windows/Fonts/Inkfree.ttf', 36)  # or load_default()
color = '#FF9F1C'  # orange for highlights
stroke = 5
pad = 18

img = Image.open('screenshot.png')
draw = ImageDraw.Draw(img)

# Rounded rect with padding
draw.rounded_rectangle(
    [x1 - pad, y1 - pad, x2 + pad, y2 + pad],
    radius=14, outline=color, width=stroke
)

# Leader line (same thickness as rect)
draw.line([x2 + pad, cy, x2 + pad + 40, cy - 30], fill=color, width=stroke)

# Label — same-color stroke for body, NO white stroke
draw.text(
    (x2 + pad + 45, cy - 60), 'label text',
    fill=color, font=font, stroke_width=1, stroke_fill=color
)

img.save('annotated.png')
```

## Algorithmic Annotation — `annotate.py`

For images with multiple elements to annotate, use the `annotate.py` module below. Save it next to your script and import from it. It handles automatic label placement without overlapping.

### Quick start

```python
from annotate import annotate_image

result = annotate_image(
    'screenshot.png',
    [
        {'elem': (560, 275, 635, 390), 'label': 'button', 'draw_box': True},
        {'elem': (105, 453, 236, 470), 'label': 'status text'},
    ],
    debug=True,
)
result.save('annotated.png')
```

- `elem`: `(x1, y1, x2, y2)` tight bounding box — must be exact pixel coordinates
- `label`: text label (supports `\n` for multi-line)
- `draw_box`: if `True`, draws a rounded rectangle around the element. If `False` (default), draws a V-arrowhead pointing at the element
- `debug`: shows targeting rectangles and candidate heatmap for placement validation

### Coordinate grid helper

**Always use `grid_image()` before annotating an unfamiliar image.** Scaled-down previews display images smaller than actual pixel dimensions — the error compounds as you move away from (0,0).

```python
from annotate import grid_image

grid = grid_image('screenshot.png', step=100)
grid.save('grid.png')
```

Then verify with small crops:

```python
from PIL import Image
img = Image.open('screenshot.png')
crop = img.crop((x1 - 20, y1 - 20, x2 + 20, y2 + 20))
crop.save('verify.png')
```

### Algorithm overview

1. **Ring search**: candidates between MIN_ARROW (25px) and MAX_ARROW (120px) from element edge
2. **Contrast scoring**: prefers placements where label text is readable — `abs(avg_brightness - 147) - std * 0.3 - dist * 0.02`
3. **Joint resolution**: candidates computed independently, placed greedily (best score first)
4. **Hard blocks**: labels cannot overlap any other annotation's element or breathing box
5. **Proximity penalty**: labels within 40px of other placed boxes get a score penalty
6. **Arrow crossing penalty**: -50 for arrows crossing already-placed arrows

### Debug mode colors

| Color | Meaning |
|-------|---------|
| Cyan | Target element box (elem + padding) |
| Gray | Exclusion zone (MIN_ARROW buffer) |
| Red→Green | Candidate heatmap (red=bad, green=good) |
| Magenta | Chosen label position |
| Orange | Final rendered annotation |

### Arrow styles

- **`draw_box=True`**: rounded rectangle + straight line to label, no arrowhead
- **`draw_box=False`**: V-shaped arrowhead with rounded line caps

### `annotate.py` — full module

Save this as `annotate.py` and import from it:

```python
"""
Algorithmic screenshot annotation with automatic label placement.

pip install Pillow numpy
Optional for diff_images: pip install scipy
"""
import math
import numpy as np
from PIL import Image, ImageDraw, ImageFont

# --- Defaults ---
DEFAULT_FONT = 'C:/Windows/Fonts/Inkfree.ttf'
DEFAULT_FONT_SIZE = 32
DEFAULT_COLOR = '#FF9F1C'
DEFAULT_STROKE = 5
MIN_ARROW = 25
MAX_ARROW = 120
TEXT_PAD = 6
BREATH = 18
CROSSING_PENALTY = 50
PROXIMITY_MARGIN = 40
PROXIMITY_PENALTY = 50


def _rect_intersects(a, b):
    return a[0] < b[2] and a[2] > b[0] and a[1] < b[3] and a[3] > b[1]


def _segments_intersect(p1, p2, p3, p4):
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    d1, d2 = cross(p3, p4, p1), cross(p3, p4, p2)
    d3, d4 = cross(p1, p2, p3), cross(p1, p2, p4)
    return ((d1 > 0 and d2 < 0) or (d1 < 0 and d2 > 0)) and \
           ((d3 > 0 and d4 < 0) or (d3 < 0 and d4 > 0))


def _line_rect_exit(cx, cy, tx, ty, rect):
    x1, y1, x2, y2 = rect
    dx, dy = tx - cx, ty - cy
    tmin, tmax = 0.0, 1.0
    for lo, hi, p, d in [(x1, x2, cx, dx), (y1, y2, cy, dy)]:
        if abs(d) < 1e-9:
            continue
        t0, t1 = (lo - p) / d, (hi - p) / d
        if t0 > t1:
            t0, t1 = t1, t0
        tmin, tmax = max(tmin, t0), min(tmax, t1)
    return (cx + dx * tmax, cy + dy * tmax)


def _rect_gap(a, b):
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    if dx == 0 and dy == 0:
        return 0
    return math.sqrt(dx**2 + dy**2)


def _find_candidates(pixels, W, H, cyan, pw, ph, font):
    cx, cy = (cyan[0] + cyan[2]) / 2, (cyan[1] + cyan[3]) / 2
    excl_zone = (cyan[0] - MIN_ARROW, cyan[1] - MIN_ARROW,
                 cyan[2] + MIN_ARROW, cyan[3] + MIN_ARROW)
    sx1 = max(0, cyan[0] - MAX_ARROW - pw)
    sy1 = max(0, cyan[1] - MAX_ARROW - ph)
    sx2 = min(W - pw, cyan[2] + MAX_ARROW)
    sy2 = min(H - ph, cyan[3] + MAX_ARROW)
    step_x = max(8, min(pw // 2, MAX_ARROW // 3))
    step_y = max(8, min(ph // 2, MAX_ARROW // 3))
    cands = []
    for px in range(sx1, sx2, step_x):
        for py in range(sy1, sy2, step_y):
            pink = (px, py, px + pw, py + ph)
            if _rect_intersects(pink, excl_zone):
                continue
            gl, gr = cyan[0] - pink[2], pink[0] - cyan[2]
            gt, gb = cyan[1] - pink[3], pink[1] - cyan[3]
            hd, vd = max(gl, gr, 0), max(gt, gb, 0)
            ed = math.sqrt(hd**2 + vd**2) if (hd > 0 and vd > 0) else max(hd, vd)
            if ed > MAX_ARROW:
                continue
            region = pixels[py:py + ph, px:px + pw, :3].astype(float)
            score = abs(np.mean(region) - 147) - np.std(region) * 0.3
            dist = math.sqrt((px + pw/2 - cx)**2 + (py + ph/2 - cy)**2)
            score -= dist * 0.02
            cands.append(((px, py), score))
    return cands


def _resolve_placements(annots, font):
    placed = []
    all_elem_zones = []
    for ann in annots:
        all_elem_zones.append(ann['cyan'])
        if ann.get('draw_box', False):
            c = ann['cyan']
            all_elem_zones.append((c[0]-BREATH, c[1]-BREATH, c[2]+BREATH, c[3]+BREATH))
    for ann in sorted(annots, key=lambda a: -a['best_score']):
        pw, ph =

Files: 1

Size: 23.3 KB

Complexity: 27/100

Category: Image & Video

Source: https://github.com/github/awesome-copilot/tree/main/plugins/visual-pr/skills/image-annotations

Related in Image & Video

watch

Included

Watch a video (URL or local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or Whisper API fallback), and hands the result to Claude so it can answer questions about what's in the video.

Image & Videoscriptsfeatured

physical-ai-defect-image-generation

Included

Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path performs inference and labeling on real images. This skill helps with first-time asset setup, creation of finetuning checkpoints, and configuring deployment. Trigger keywords: defect image generation, dig workflow, dig pipeline, defect image detection workflow, aoi pipeline, aoi anomalygen, usd2roi anomalygen, day 0 pcba, day 1 pcba, day 1 real-photo alignment, day 1 manual roi, metal surface anomaly, glass defect, anomalygen finetune, setup_pcb, setup_metal, setup_glass, setup_pretrained, dig setup, dig datasets, dig pretrained checkpoint, dig image-edit endpoint.

Image & Videoscripts

accelint-react-best-practices

Included

React performance optimization and best practices. ALWAYS use this skill when working with any React code - writing components, hooks, JSX; refactoring; optimizing re-renders, memoization, state management; reviewing for performance; fixing hydration mismatches; debugging infinite re-renders, stale closures, input focus loss, animations restarting; preventing remounting; implementing transitions, lazy initialization, effect dependencies. Even simple React tasks benefit from these patterns. Covers React 19+ (useEffectEvent, Activity, ref props). Triggers - useEffect, useState, useMemo, useCallback, memo, inline components, nested components, components inside components, re-render, performance, hydration, SSR, Next.js, useDeferredValue, combined hooks.

Image & Videoscripts

elevenlabs-agents

Included

Build conversational AI voice agents with ElevenLabs Platform using React, JavaScript, React Native, or Swift SDKs. Configure agents, tools (client/server/MCP), RAG knowledge bases, multi-voice, and Scribe real-time STT. Use when: building voice chat interfaces, implementing AI phone agents with Twilio, configuring agent workflows or tools, adding RAG knowledge bases, testing with CLI "agents as code", or troubleshooting deprecated @11labs packages, Android audio cutoff, CSP violations, dynamic variables, or WebRTC config. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication

Image & Videoscripts

humanizer

Included

Humanize AI-generated text by detecting and removing patterns typical of LLM output. Rewrites text to sound natural, specific, and human. Uses 28 pattern detectors, 560+ AI vocabulary terms across 3 tiers, and statistical analysis (burstiness, type-token ratio, readability) for comprehensive detection. Use when asked to humanize text, de-AI writing, make content sound more natural/human, review writing for AI patterns, score text for AI detection, or improve AI-generated drafts. Covers content, language, style, communication, and filler categories.

Image & Videoscripts

generating-mermaid-diagrams

Included

Salesforce architecture diagrams using Mermaid with ASCII fallback. Use this skill when generating text-based diagrams for Salesforce architecture, OAuth flows, ERDs, integration sequences, or Agentforce structure. TRIGGER when: user says "diagram", "visualize", "ERD", or asks for sequence diagrams, flowcharts, class diagrams, or architecture visualizations in Mermaid. DO NOT TRIGGER when: user wants PNG/SVG image output (use generating-visual-diagrams), or asks about non-Salesforce systems.

Image & Videoscripts