extracting-form-fields

Extract form field data from PDFs as a first step to filling PDF forms


# Extracting Form Fields

Prepare working directory and extract field data from PDF forms.

<purpose>
This skill extracts PDF form information into structured JSON, as a first step to filling the form:
- Detects fillable vs. non-fillable PDFs
- Extracts PDF content as readable Markdown
- Creates field metadata in common JSON format
</purpose>

## Inputs

- **PDF path**: Path to PDF file (e.g., `/home/user/input.pdf`)

## Process Overview

```plantuml
@startuml SKILL
title Extracting Form Fields - High-Level Workflow
start
:Create working directory;
:Copy interview template;
:Extract PDF content as Markdown;
:Check Fillability;
if (PDF has fillable fields?) then (yes)
  :Fillable workflow
  (see Fillable-Forms.md);
else (no)
  :Non-fillable workflow
  (see Nonfillable-Forms.md);
endif
:**✓ EXTRACTION COMPLETE**;
:Ready for Form Data Model creation;
stop
@enduml
```

## Process

### 1. Create Working Directory

```bash
mkdir <basename>.chatfield
```

### 2. Copy Interview Template

Copy the interview template from the included `filling-pdf-forms` skill. The path below is relative to this skill's directory.

```bash
cp ../filling-pdf-forms/scripts/chatfield_interview_template.py <basename>.chatfield/interview.py
```

### 3. Extract PDF Content

```bash
markitdown <pdf_path> > <basename>.chatfield/<basename>.form.md
```
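
The paths used in steps 1 to 3 can be sketched in Python. This is a minimal sketch, assuming `<basename>` means the PDF file name without its extension and that the working directory sits in the current directory; `plan_paths` is a hypothetical helper, not one of the skill's scripts.

```python
from pathlib import Path


def plan_paths(pdf_path: str) -> dict:
    """Derive the working-directory paths used in steps 1-3.

    Assumption: <basename> is the PDF file name without its extension.
    """
    basename = Path(pdf_path).stem
    workdir = Path(f"{basename}.chatfield")
    return {
        "workdir": workdir,                          # step 1: mkdir target
        "interview": workdir / "interview.py",       # step 2: template copy target
        "form_md": workdir / f"{basename}.form.md",  # step 3: markitdown output
    }


paths = plan_paths("/home/user/input.pdf")
print(paths["workdir"].as_posix())
print(paths["form_md"].as_posix())
```

Computing the paths once keeps the three shell commands consistent with each other when the PDF lives outside the current directory.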

### 4. Check Fillability

```bash
python scripts/check_fillable_fields.py <pdf_path>
```

**Output:**
- `"This PDF has fillable form fields"` → use fillable workflow
- `"This PDF does not have fillable form fields"` → use non-fillable workflow

### 5. Branch Based on Fillability

#### If Fillable:

Follow ./references/Fillable-Forms.md

#### If Non-fillable:

Follow ./references/Nonfillable-Forms.md
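
The branch in steps 4 and 5 can be sketched as a small dispatcher over the script's printed message. This is a sketch under assumptions: `choose_workflow` is a hypothetical helper, and it assumes the message text matches the two strings listed under step 4.

```python
def choose_workflow(script_output: str) -> str:
    """Map the fillability message to the reference file to follow.

    Assumption: the message matches one of the two strings in step 4.
    """
    text = script_output.strip()
    # Check the negative message first so it cannot be mistaken for the
    # positive one.
    if "does not have fillable form fields" in text:
        return "./references/Nonfillable-Forms.md"
    if "has fillable form fields" in text:
        return "./references/Fillable-Forms.md"
    raise ValueError(f"Unrecognized fillability output: {text!r}")


print(choose_workflow("This PDF has fillable form fields"))
print(choose_workflow("This PDF does not have fillable form fields"))
```

Raising on an unrecognized message avoids silently picking a workflow when the script fails or prints something else.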

## Output Format

### Fillable PDFs - .form.json

```json
[
  {
    "field_id": "topmostSubform[0].Page1[0].f1_01[0]",
    "type": "text",
    "page": 1,
    "rect": [100, 200, 300, 220],
    "tooltip": "Enter your full legal name",
    "max_length": null
  },
  {
    "field_id": "checkbox_over_18",
    "type": "checkbox",
    "page": 1,
    "rect": [150, 250, 165, 265],
    "checked_value": "/1",
    "unchecked_value": "/Off"
  }
]
```
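
As an illustration of how a later step might consume this format, the sketch below loads the example above and groups field IDs by type. It uses only the keys shown in the example; the variable names are hypothetical, and real `.form.json` files may carry other field types or keys.

```python
import json
from collections import defaultdict

# The example .form.json content from above, inlined so the sketch is
# self-contained.
form_json = """
[
  {
    "field_id": "topmostSubform[0].Page1[0].f1_01[0]",
    "type": "text",
    "page": 1,
    "rect": [100, 200, 300, 220],
    "tooltip": "Enter your full legal name",
    "max_length": null
  },
  {
    "field_id": "checkbox_over_18",
    "type": "checkbox",
    "page": 1,
    "rect": [150, 250, 165, 265],
    "checked_value": "/1",
    "unchecked_value": "/Off"
  }
]
"""

fields = json.loads(form_json)

# Group field IDs by their "type" value.
by_type = defaultdict(list)
for field in fields:
    by_type[field["type"]].append(field["field_id"])

# Collect the values a checkbox accepts, keyed by field ID.
checkbox_values = {
    f["field_id"]: (f["checked_value"], f["unchecked_value"])
    for f in fields
    if f["type"] == "checkbox"
}

print(dict(by_type))
print(checkbox_values)
```

Note that JSON `null` (as in `max_length`) becomes `None` in Python, so downstream code should treat it as "no limit given" rather than as a number.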

## References

- ./references/Fillable-Forms.md - Fillable PDF extraction workflow
- ./references/Nonfillable-Forms.md - Non-fillable PDF extraction workflow

Files: 11

Size: 39.0 KB

Source: https://github.com/jhs/chatfield/tree/main/Claude/filling-pdf-forms/skills/extracting-form-fields
