pdf-conversion-router

Use when converting a PDF into another format such as Markdown, HTML, text, JSON, DOCX, or structured notes and the agent must choose the best extraction route, settings, and cleanup strategy for maximum fidelity and readability.

# PDF Conversion Router

Route every PDF conversion through a short analysis step before choosing tools or CLI flags.

The goal is not "extract the most text". The goal is:
- preserve structure
- preserve attachment between labels and values
- choose the most faithful output shape
- avoid noisy defaults when a better route exists

## When to Use

- The user wants a PDF converted into another format.
- The requested output is `.md`, `.html`, `.txt`, `.json`, `.docx`, or structured notes.
- The PDF may be scanned, OCR-heavy, table-heavy, slide-based, medical, academic, or multi-column.

## Core Rule

Never start with one fixed default pipeline.

Always:
1. classify the PDF
2. classify the target output
3. choose the strongest route for that combination
4. validate the result on representative sections
5. if needed, retry with better settings before delivering

Heuristics are starting points, not guarantees.

Do not promote one flag combination into a universal default just because it worked well on one PDF.
Prefer document-specific evidence over habit.
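
The classify, route, validate, retry loop above can be sketched as a small driver. This is a minimal illustration, not part of the skill: `convert_with_retries`, `run`, and `validate` are hypothetical names, and the caller is assumed to supply the candidate routes in order of preference.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def convert_with_retries(
    routes: Iterable[Sequence[str]],
    run: Callable[[Sequence[str]], str],
    validate: Callable[[str], List[str]],
) -> Tuple[Sequence[str], str, List[str]]:
    """Try each candidate route in order.

    Returns (route, output, problems) for the first route whose output
    passes validation, otherwise the attempt with the fewest problems.
    """
    best = None
    for route in routes:
        output = run(route)            # hypothetical: runs the conversion
        problems = validate(output)    # hypothetical: returns a list of issues
        if not problems:
            return route, output, problems
        if best is None or len(problems) < len(best[2]):
            best = (route, output, problems)
    if best is None:
        raise ValueError("no routes supplied")
    return best
```

Returning the least-bad attempt together with its problem list keeps the decision to deliver, repair, or fall back with the agent rather than hiding it in the loop.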

## Primary Engine Rule

Use `opendataloader-pdf` as the primary conversion engine for every PDF conversion task by default.

This skill should assume:
- `opendataloader-pdf` is always the first conversion attempt
- other tools are used to classify, validate, OCR, inspect, or support cleanup
- other extractors are not the default replacement for the main conversion route

Use other tools only for one of these reasons:
- quick classification of the PDF
- OCR preprocessing before conversion
- validation against layout-preserving text
- manual repair when the generated output is still noisy
- fallback only if `opendataloader-pdf` cannot produce a usable result

## Step 1: Classify the Source PDF

Identify the document class as quickly as possible:

- Native digital PDF with selectable text
- OCR PDF with noisy text
- Image-only/scanned PDF
- Slide deck / presentation export
- Medical or lab report
- Table-heavy business/finance document
- Narrative report / letter / article
- Mixed layout document with diagrams, tables, and prose

Useful fast checks:

```bash
pdfinfo input.pdf
pdftotext -layout input.pdf -
```

If text is missing or very poor, treat OCR as required.
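
As a rough sketch of how "missing or very poor" could be turned into a check, the function below takes the text produced by `pdftotext -layout` and the page count reported by `pdfinfo`. The function name and both thresholds are assumptions chosen for illustration; tune them against real documents.

```python
def needs_ocr(
    layout_text: str,
    page_count: int,
    min_chars_per_page: int = 200,   # assumed threshold, not from the skill
    max_junk_ratio: float = 0.3,     # assumed threshold, not from the skill
) -> bool:
    """Return True when the text layer looks absent or too noisy to trust."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    visible = [c for c in layout_text if not c.isspace()]
    if not visible:
        return True
    # Very little text per page suggests an image-only or scanned PDF.
    if len(visible) / page_count < min_chars_per_page:
        return True
    # A high share of unusual symbols suggests a noisy OCR text layer.
    common = ".,;:!?%()[]/+-='\"&$#*<>"
    junk = sum(1 for c in visible if not (c.isalnum() or c in common))
    return junk / len(visible) > max_junk_ratio
```

A `False` result only means the text layer is plausibly usable; it does not replace the validation gates later in this skill.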

## Document-Type Heuristics

Use these as default starting points:

- medical / lab report
  `-f markdown-with-html --table-method cluster --image-output off`

- slide deck / PowerPoint export
  `-f markdown-with-html --image-output off`
  add `--table-method cluster` only if the default route under-structures important tabular content
  if tables are visually obvious but missing or badly fused, treat this as a detection problem, not a Markdown formatting problem
  if the selected route already reconstructs a real table but clips leading characters at column boundaries, treat that as a boundary-splitting defect, not a missing-table failure

- narrative / article / letter
  start with `markdown` or `text`
  use `markdown-with-html` only if structure clearly matters

- table-heavy business / finance PDF
  start with `markdown-with-html`
  add `--table-method cluster` when rows or columns flatten

- scanned / image-heavy PDF
  OCR first, then convert with `opendataloader-pdf`

- mixed-layout PDF
  prefer `markdown-with-html`
  validate one easy section and one hard section before accepting output
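
The starting points above can be kept as a lookup table so the choice stays explicit and easy to override. This is a sketch under assumptions: the class keys and `starting_command` are hypothetical names, the input path is assumed to go last as a positional argument, and only flags that appear in this skill are used.

```python
# Starting flags per document class, taken from the heuristics above.
STARTING_ROUTES = {
    "medical": ["-f", "markdown-with-html", "--table-method", "cluster",
                "--image-output", "off"],
    "slides": ["-f", "markdown-with-html", "--image-output", "off"],
    "narrative": ["-f", "markdown"],
    "table_heavy": ["-f", "markdown-with-html"],
    "mixed": ["-f", "markdown-with-html"],
}


def starting_command(doc_class, input_pdf, tables_flattened=False):
    """Build the first-attempt argv for a document class.

    tables_flattened reflects evidence from an earlier attempt; it adds
    cluster table detection for the classes where the skill allows it.
    """
    flags = list(STARTING_ROUTES[doc_class])
    escalatable = doc_class in ("slides", "table_heavy")
    if tables_flattened and escalatable and "--table-method" not in flags:
        flags += ["--table-method", "cluster"]
    # Assumption: the input path is passed as the final positional argument.
    return ["opendataloader-pdf", *flags, input_pdf]
```

Scanned PDFs are deliberately absent from the table: they go through OCR first and are then classified by their content.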

## Step 2: Choose the Output Shape

Pick the output that best matches the document and the user's goal.

- `markdown-with-html`
  Use by default when the user wants Markdown and fidelity matters.
  Prefer this for tables, medical reports, slides, mixed-layout PDFs, and anything likely to break in pure Markdown.

- `markdown`
  Use only when clean plain Markdown matters more than layout fidelity.

- `html`
  Use when visual structure matters more than LLM readability.

- `text`
  Use for quick linear extraction, narrative documents, or when structure is unimportant.

- `json`
  Use when downstream machine processing matters more than human readability.

- `docx`
  Use when the user wants editable office output and layout reconstruction matters.
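
A small sketch of the shape decision, assuming the request arrives as a file extension plus a flag saying whether fidelity matters. `choose_output_shape` is a hypothetical helper; it only names the shape and makes no claim about which tool produces it.

```python
def choose_output_shape(requested_ext: str, fidelity_matters: bool = True) -> str:
    """Map a requested extension to one of the output shapes listed above."""
    ext = requested_ext.lower().lstrip(".")
    if ext == "md":
        # Markdown requests default to the HTML-augmented flavor when
        # fidelity matters, and to plain Markdown otherwise.
        return "markdown-with-html" if fidelity_matters else "markdown"
    shapes = {"html": "html", "txt": "text", "json": "json", "docx": "docx"}
    if ext not in shapes:
        raise ValueError(f"unsupported target extension: {requested_ext!r}")
    return shapes[ext]
```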

## Step 3: Choose the Extraction Route

### For OpenDataLoader CLI

Use OpenDataLoader as the default route.

Preferred defaults:

- For Markdown output with fidelity priority:
  `-f markdown-with-html`

- For medical PDFs:
  add `--table-method cluster`

- For table-heavy PDFs:
  add `--table-method cluster`

- For slide decks:
  start without `--table-method cluster`
  add it only after a structure check shows meaningful improvement
  if a pseudo-table is already collapsed inside one detected row, changing only the Markdown flavor usually will not fix it
  if the active engine build recovers the pseudo-table structure, prefer fixing residual boundary artifacts before escalating to hybrid/full mode

- For conversions where images are not requested:
  add `--image-output off`

- For slide decks, medical reports, and structure-sensitive PDFs:
  validate both that the command succeeded and that the rendered structure is correct

- For reports where exact values matter:
  validate key sections after conversion instead of trusting the first pass

### For medical or lab PDFs

Default route:

```bash
opendataloader-pdf -f markdown-with-html --table-method cluster --image-output off input.pdf
```

Then verify:
- main table headers
- attachment of value, unit, and reference range
- legends/comments separated from result rows

If a clinical table is flattened, compare against `pdftotext -layout` before accepting output.
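
The comparison against `pdftotext -layout` can be partly automated. The sketch below is an assumption-laden illustration: `detached_values` is a hypothetical helper, the caller supplies the exam labels to check, and each converted table row is assumed to sit on a single line (normalize multi-line HTML rows first if that does not hold).

```python
import re

NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


def detached_values(layout_text: str, converted: str, labels: list) -> list:
    """Return labels whose first value in the layout text is not found on
    the same line as the label in the converted output."""
    layout_lines = layout_text.splitlines()
    converted_lines = converted.splitlines()
    problems = []
    for label in labels:
        source = next((ln for ln in layout_lines if label in ln), None)
        if source is None:
            continue  # label not present in the reference text
        match = NUMBER.search(source.split(label, 1)[1])
        if match is None:
            continue  # no numeric value follows the label
        value = match.group()
        if not any(label in ln and value in ln for ln in converted_lines):
            problems.append(label)
    return problems
```

An empty result is weak evidence only: the check is substring-based, so units and reference ranges still need a manual look.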

### For slide decks

Prefer:

```bash
opendataloader-pdf -f markdown-with-html --image-output off input.pdf
```

Then check for:
- repeated footers
- page numbers
- diagram pseudo-tables
- orphan symbols and chart labels

If CLI output is still poor, do a cleanup pass tuned for slides instead of assuming the raw extract is final.
If the slide contains obvious table-like blocks that are not detected as tables at all, prefer a same-engine retry with a stronger route such as hybrid/full mode before jumping to unrelated extractors.
If the slide now produces a real table, validate the first column and header boundaries before assuming the table is fully correct.
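
A slide-tuned cleanup pass might start by dropping repeated footers and bare page numbers. The sketch below assumes the extract has already been split into one string per slide; `strip_slide_noise`, the page-number pattern, and the 60% repetition threshold are illustrative choices, not part of the skill.

```python
import re
from collections import Counter

# Matches lines such as "7", "7/20", "7 of 20", or "Page 7 of 20".
PAGE_NUMBER = re.compile(
    r"^\s*(?:page\s+)?\d+(?:\s*(?:/|of)\s*\d+)?\s*$", re.IGNORECASE
)


def strip_slide_noise(pages: list, min_share: float = 0.6) -> list:
    """Remove lines repeated on most slides and lines that are only a
    page number. Returns one cleaned string per input slide."""
    counts = Counter()
    for page in pages:
        # Count each distinct line once per slide.
        counts.update({ln.strip() for ln in page.splitlines() if ln.strip()})
    threshold = max(2, int(len(pages) * min_share))
    repeated = {line for line, n in counts.items() if n >= threshold}
    cleaned = []
    for page in pages:
        kept = [
            ln for ln in page.splitlines()
            if ln.strip() not in repeated and not PAGE_NUMBER.match(ln)
        ]
        cleaned.append("\n".join(kept))
    return cleaned
```

Because a repeated line can also be legitimate content, such as a recurring section title, review what was removed before delivering the cleaned output.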

### For scanned PDFs

If the text layer is poor or absent:
- run OCR first
- then convert the OCR'd PDF with `opendataloader-pdf`

Prefer conservative reconstruction over aggressive guessing.

## Step 4: Validation Gates

Before claiming success, inspect the output for the patterns most likely to break.

For medical PDFs:
- values attached to correct exam names
- units and reference ranges not merged into neighbors
- comments not merged into rows

For slides:
- bullets normalized
- footers/page numbers removed when they are noise
- diagrams not causing crashes
- remaining tables readable enough to follow
- first column labels not losing their first character at inferred column boundaries
- pseudo-table recovery not breaking row grouping or spilling labels into the next column

For table-heavy documents:
- no catastrophic row flattening
- headers preserved
- repeated empty separator rows minimized
- sparse or single-column tables not accidentally collapsed into prose
- table bodies not fused into a single HTML or Markdown row containing many logical records

For every document class:
- check the first representative section, not just the top of the file
- check one complex section, not only a simple section
- prefer document-level confidence over success on page 1
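
One of the table gates above, a body fused into a single row holding many logical records, can be flagged mechanically when the output contains HTML tables. This is a heuristic sketch using only the standard library: `fused_rows` and both limits are assumptions, and it inspects numeric tokens and line breaks only, so text-only fused cells still need a visual check.

```python
import re
from html.parser import HTMLParser

NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


class _TableRows(HTMLParser):
    """Collect cell text as tables -> rows -> cells."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = []
        elif tag == "br" and self._cell is not None:
            self._cell.append("\n")

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.tables[-1][-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def fused_rows(html, max_numbers_per_cell=2, max_lines_per_cell=2):
    """Return (table_index, row_index) pairs for rows with a cell that
    holds more numbers or more lines than the given limits."""
    parser = _TableRows()
    parser.feed(html)
    flagged = []
    for t, rows in enumerate(parser.tables):
        for r, row in enumerate(rows):
            if any(
                len(NUMBER.findall(cell)) > max_numbers_per_cell
                or len(cell.splitlines()) > max_lines_per_cell
                for cell in row
            ):
                flagged.append((t, r))
    return flagged
```

The default of two numbers per cell leaves room for a reference range such as `12.0 - 16.0` in one cell; raise it for documents whose cells legitimately carry more values.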

## Red Flags

Treat these as signals that the current output is not ready:

- table rows flattened into long prose lines
- table header looks correct but the entire body is fused into one row with multi-value cells
- labels detached from values
- units or reference ranges drifting into adjacent rows
- repeated page footers or page numbers
- pseudo-tables with mostly empty cells
- legitimate sparse tables collapsed into paragraphs
- single-column tables flattened because they looked "too simple"
- stray symbols,

Source: https://github.com/sickn33/antigravity-awesome-skills/tree/main/plugins/antigravity-awesome-skills/skills/pdf-conversion-router
