performing-firmware-extraction-with-binwalk


Performs firmware image extraction and analysis using binwalk to identify embedded filesystems, compressed archives, bootloaders, kernel images, and cryptographic material. Covers entropy analysis for detecting encrypted or compressed regions, recursive extraction of nested archives, SquashFS/CramFS/JFFS2 filesystem mounting, and string analysis for credential and configuration discovery. Activates for requests involving firmware reverse engineering, IoT device analysis, embedded system security assessment, or router/camera firmware extraction.

Tags: Image & Video, firmware, binwalk, extraction, entropy, IoT-security, reverse-engineering, scripts

What this skill does


# Performing Firmware Extraction with Binwalk

## When to Use

- Analyzing IoT device firmware downloaded from vendor sites or extracted from flash chips
- Reverse engineering router, camera, or embedded device firmware for vulnerability research
- Identifying embedded filesystems (SquashFS, CramFS, JFFS2, UBIFS) within firmware blobs
- Detecting encrypted or compressed regions using entropy analysis
- Extracting hardcoded credentials, API keys, certificates, or configuration files from firmware
- Performing security assessments of embedded devices in authorized penetration tests

**Do not use** for analyzing standard desktop application binaries or malware samples that are not firmware images; use dedicated malware analysis tools instead.

## Prerequisites

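- binwalk installed (`pip install binwalk3` or from system package manager). The commands in this workflow use the flag set of the Python-based v2.x CLI (`-A`, `-R`, `-B`, `-K`, `-D`, and `-d` as a recursion depth); the Rust-based v3.x rewrite removes or repurposes several of these, so check `binwalk --help` for the installed version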
- Python 3.8+ with standard libraries (struct, math, hashlib, subprocess)
- SquashFS tools (`unsquashfs`) for unpacking extracted SquashFS filesystems
- Jefferson for JFFS2 filesystem extraction (`pip install jefferson`)
- Sasquatch for non-standard SquashFS variants used by vendors like TP-Link and D-Link
- `strings` utility (GNU binutils) for string extraction
- Optional: firmware-mod-kit for repacking modified firmware images

## Workflow

### Step 1: Initial Firmware Reconnaissance

Perform a signature scan to identify embedded file types and their offsets:

```bash
# Basic signature scan - identify all recognized file types
binwalk firmware.bin

# Scan with verbose output showing confidence levels
binwalk -v firmware.bin

# Scan for specific file types only (repeat -y once per signature)
binwalk -y "squashfs" firmware.bin
binwalk -y "gzip" -y "lzma" -y "xz" firmware.bin

# Opcode scan to identify CPU architecture
binwalk -A firmware.bin

# Scan for raw strings to find version info, URLs, credentials
binwalk -R "password" firmware.bin
binwalk -R "http://" firmware.bin
```
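
As a rough illustration of what a signature scan does, the sketch below searches a byte string for a handful of well-known magic values and reports their offsets. The `SIGNATURES` table and the `scan_signatures` name are hypothetical and chosen for this example; binwalk's own signature set is far larger and validates the headers it finds, which this sketch does not attempt.

```python
from typing import Iterator, Tuple

# A few well-known magic values (illustrative subset, not binwalk's database).
SIGNATURES = {
    b"hsqs": "SquashFS (little-endian)",
    b"\x1f\x8b\x08": "gzip (deflate)",
    b"\xfd7zXZ\x00": "xz",
    b"\x27\x05\x19\x56": "U-Boot legacy uImage header",
}


def scan_signatures(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, description) for every magic match, in offset order."""
    hits = []
    for magic, name in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            # Continue searching one byte further so overlapping hits are kept.
            start = data.find(magic, start + 1)
    yield from sorted(hits)
```

Calling `list(scan_signatures(open("firmware.bin", "rb").read()))` would give a list of candidate offsets; short magics such as the 3-byte gzip prefix will produce false positives in large images, which is why real tools check the surrounding header fields as well.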

### Step 2: Entropy Analysis

Analyze entropy to identify encrypted, compressed, and plaintext regions:

```bash
# Generate entropy plot
binwalk -E firmware.bin

# Entropy with specific block size for higher resolution
binwalk -E -K 256 firmware.bin

# Combined entropy and signature scan
binwalk -BE firmware.bin
```

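Interpreting entropy values (Shannon entropy in bits per byte on a 0-8 scale; binwalk v2.x plots a normalized 0-1 value, so multiply by 8 to compare). The ranges are rough rules of thumb, and entropy alone cannot separate compression from encryption:
- **0.0 - 1.0**: Empty or padding regions (null bytes, 0xFF fill)
- **1.0 - 5.0**: Plaintext data, ASCII strings, configuration
- **5.0 - 7.0**: Machine code and weakly compressed data
- **7.0 - 7.99**: Strongly compressed or encrypted data (gzip, LZMA, and zlib streams usually land here)
- **~8.0**: Maximum entropy, likely encrypted or random data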
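
The values above are Shannon entropy measured in bits per byte. A minimal sketch of how such a per-block profile can be computed is shown here; the function names and the default 1024-byte block size are assumptions made for the example, not binwalk's internals.

```python
import math
from collections import Counter
from typing import List, Tuple


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # sum over observed byte values of p * log2(1/p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(data: bytes, block_size: int = 1024) -> List[Tuple[int, float]]:
    """Return (offset, entropy) for each consecutive block of the image."""
    return [
        (off, shannon_entropy(data[off:off + block_size]))
        for off in range(0, len(data), block_size)
    ]
```

A block of identical bytes scores 0.0 and a block in which all 256 byte values are equally frequent scores 8.0. Smaller blocks give finer resolution but noisier values, which is the same trade-off as adjusting the block size in the CLI.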

### Step 3: Extract Embedded Files

Extract all identified components from the firmware image:

```bash
# Automatic extraction of known file types
binwalk -e firmware.bin

# Recursive extraction (matryoshka mode) for nested archives
binwalk -Me firmware.bin

# Recursive extraction with depth limit
binwalk -Me -d 5 firmware.bin

# Extract specific file type with custom handler
binwalk -D "squashfs filesystem:squashfs:unsquashfs %e" firmware.bin

# Manual extraction of data at a known offset
dd if=firmware.bin of=extracted.squashfs bs=1 skip=327680 count=4194304
```
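
For manual carving, the `count` value has to come from somewhere. One option, sketched below, is to read the size from the filesystem header itself: a little-endian SquashFS 4.x superblock stores a 64-bit `bytes_used` field 40 bytes after the `hsqs` magic. The helper names are hypothetical, and the sketch assumes the whole image fits in memory and that the offset was already found by a signature scan.

```python
import struct

SQUASHFS_MAGIC = b"hsqs"  # little-endian SquashFS 4.x


def squashfs_bytes_used(data: bytes, offset: int) -> int:
    """Return the bytes_used field of the SquashFS superblock at `offset`."""
    if data[offset:offset + 4] != SQUASHFS_MAGIC:
        raise ValueError(f"no SquashFS magic at offset {offset}")
    # bytes_used is a 64-bit little-endian integer 40 bytes into the superblock.
    (bytes_used,) = struct.unpack_from("<Q", data, offset + 40)
    return bytes_used


def carve(data: bytes, offset: int, size: int) -> bytes:
    """Return `size` bytes starting at `offset`, refusing truncated regions."""
    end = offset + size
    if offset < 0 or size < 0 or end > len(data):
        raise ValueError("requested region extends past the end of the image")
    return data[offset:end]
```

Writing `carve(image, off, squashfs_bytes_used(image, off))` to a file gives roughly what the `dd` command produces, without a hand-computed length. `bytes_used` excludes any trailing padding the vendor added, so the carved file can be slightly shorter than the region reported by other tools.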

### Step 4: Mount and Inspect Extracted Filesystems

Unpack or mount the extracted filesystems for deep inspection:

```bash
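# Unpack SquashFS filesystem (unsquashfs creates the destination directory;
# some versions refuse to write into an existing one unless -f is given)
unsquashfs -d /tmp/squashfs_root extracted.squashfs

# Mount CramFS filesystem read-only (requires root and kernel cramfs support)
mkdir -p /tmp/cramfs_root
sudo mount -t cramfs -o loop,ro extracted.cramfs /tmp/cramfs_root

# Extract JFFS2 filesystem
jefferson extracted.jffs2 -d /tmp/jffs2_root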

# Inspect the extracted filesystem
ls -la /tmp/squashfs_root/
find /tmp/squashfs_root -name "*.conf" -o -name "*.cfg" -o -name "*.key"
find /tmp/squashfs_root -name "passwd" -o -name "shadow"
```
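
Once `passwd` or `shadow` has been located, a quick triage of the account entries helps decide which ones deserve attention. The sketch below classifies each line of a shadow-format file by its second field; the `classify_shadow` name and the three status labels are assumptions for this example, and devices that keep hashes directly in `/etc/passwd` would need the same check applied there.

```python
from typing import Dict


def classify_shadow(shadow_text: str) -> Dict[str, str]:
    """Map each account in shadow-format text to a coarse password status."""
    result: Dict[str, str] = {}
    for line in shadow_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 2:
            continue  # not a shadow-style entry
        user, password_field = fields[0], fields[1]
        if password_field == "":
            status = "empty password"      # login without a password
        elif password_field.startswith(("!", "*")):
            status = "locked"              # no valid hash can match
        else:
            status = "password hash set"   # candidate for offline review
        result[user] = status
    return result
```

Typical use would be `classify_shadow(open("/tmp/squashfs_root/etc/shadow").read())`; accounts reported as "empty password" or carrying a hash shared across every unit of a device model are usually the first items to record.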

### Step 5: String Analysis and Credential Discovery

Search extracted filesystem and raw firmware for sensitive data:

```bash
# Extract all printable strings
strings -a firmware.bin > all_strings.txt
strings -n 12 firmware.bin | sort -u > long_strings.txt

# Search for credentials and secrets
grep -rni "password\|passwd\|secret\|api_key\|token" /tmp/squashfs_root/etc/
grep -rni "BEGIN.*PRIVATE KEY" /tmp/squashfs_root/

# Find hardcoded URLs and endpoints
grep -rnoE "https?://[a-zA-Z0-9./?=_-]+" /tmp/squashfs_root/

# Search for certificate files
find /tmp/squashfs_root -name "*.pem" -o -name "*.crt" -o -name "*.key" -o -name "*.p12"

# Identify busybox and service versions
strings /tmp/squashfs_root/bin/busybox | grep "BusyBox v"
cat /tmp/squashfs_root/etc/banner 2>/dev/null
```
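
The grep searches above can be folded into one pass over the extracted tree. The following is a minimal sketch under stated assumptions: the `PATTERNS` table and `scan_tree` name are invented for the example, only the first megabyte of each file is read, and symlinks are skipped because links inside extracted firmware may point at absolute paths on the analysis host.

```python
import os
import re
from typing import List, Tuple

# Illustrative patterns mirroring the grep expressions; extend as needed.
PATTERNS = {
    "private key": re.compile(rb"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential keyword": re.compile(
        rb"(?i)(password|passwd|secret|api_key|token)\s*[=:]\s*\S+"
    ),
    "url": re.compile(rb"https?://[a-zA-Z0-9./?=_-]+"),
}


def scan_tree(root: str, max_bytes: int = 1_000_000) -> List[Tuple[str, str, str]]:
    """Return sorted (relative_path, label, matched_text) findings under root."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; never follow links out of root.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                data = fh.read(max_bytes)
            for label, pattern in PATTERNS.items():
                for match in pattern.finditer(data):
                    text = match.group(0).decode("latin-1")
                    findings.append((os.path.relpath(path, root), label, text))
    return sorted(findings)
```

Running `scan_tree("/tmp/squashfs_root")` returns tuples that can go straight into a report. Keyword matches are noisy by nature, so each hit still needs manual review.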

### Step 6: Generate Firmware Analysis Report

Compile comprehensive extraction and analysis findings:

```
Report should include:
- Firmware metadata (vendor, model, version, build date)
- Identified components with offsets and sizes (bootloader, kernel, filesystem, config)
- Entropy analysis summary with regions of interest
- Extracted filesystem structure and key contents
- Discovered credentials, keys, certificates
- Identified services, daemons, and their versions
- Known CVEs applicable to identified component versions
- Recommendations for hardening or vulnerability remediation
```
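
One way to keep the report reproducible is to assemble it as structured data keyed to a hash of the exact image analyzed. The sketch below is only an illustration: the `build_report` name and the field layout are assumptions, not a format the skill prescribes.

```python
import hashlib
import json
from typing import Dict, List


def build_report(image: bytes, components: List[Dict], findings: List[Dict]) -> str:
    """Serialize analysis results as JSON, tied to the image's SHA-256."""
    report = {
        "image": {
            "size": len(image),
            "sha256": hashlib.sha256(image).hexdigest(),
        },
        # Components are listed in the order they appear in the image.
        "components": sorted(components, key=lambda c: c["offset"]),
        "findings": findings,
    }
    return json.dumps(report, indent=2)
```

Each component is expected to be a dictionary with at least an `offset` key (for example `{"name": "rootfs", "offset": 327680, "size": 4194304}`); the remaining report sections listed above can be added as further top-level keys.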

## Key Concepts

| Term | Definition |
|------|------------|
| **Firmware** | Software embedded in hardware devices providing low-level control; typically contains a bootloader, kernel, root filesystem, and configuration data |
| **Entropy Analysis** | Statistical measurement of randomness in binary data; high entropy indicates encryption or compression, low entropy indicates plaintext or structured data |
| **SquashFS** | Read-only compressed filesystem commonly used in embedded Linux devices; supports gzip, LZMA, LZO, XZ, LZ4, and zstd compression |
| **Magic Bytes** | Known byte sequences at fixed offsets that identify file types; binwalk uses a database of magic signatures to detect embedded files |
| **Matryoshka Extraction** | Recursive extraction mode where binwalk re-scans extracted files for additional embedded content, handling deeply nested archives |
| **CramFS** | Compressed ROM filesystem designed for embedded systems with limited flash storage; supports only zlib compression |
| **JFFS2** | Journalling Flash File System version 2, designed for NOR and NAND flash memory in embedded devices |

## Tools & Systems

- **binwalk**: Primary firmware analysis tool for signature scanning, entropy analysis, and automated extraction of embedded files
- **unsquashfs**: SquashFS extraction utility for unpacking read-only compressed filesystems found in router and IoT firmware
- **jefferson**: Python tool for extracting JFFS2 flash filesystem images commonly found in embedded devices
- **sasquatch**: Patched SquashFS utility supporting non-standard vendor-modified SquashFS variants
- **firmware-mod-kit**: Toolkit for extracting, modifying, and repacking firmware images for security testing

## Common Scenarios

### Scenario: Extracting and Auditing Router Firmware for Hardcoded Credentials

**Context**: A security researcher is performing an authorized assessment of a consumer router. The firmware update file was downloaded from the vendor's support page. The goal is to identify hardcoded credentials, insecure default configurations, and known vulnerable components.

**Approach**:
1. Run `binwalk -e firmware.bin` to perform initial extraction
2. Use `binwalk -E firmware.bin` to check entropy and identify encrypted regions
3. Locate the SquashFS root filesystem in the extracted output
4. Unpack with `unsquashfs` and inspect `/etc/passwd`, `/etc/shadow`, and web server configs
5. Search for hardcoded credentials with `grep -rni "password" /tmp/squashfs_root/etc/`
6. Identify service versions and cross-reference with CVE databases
7. Check for debug interfaces (telnet, UART, JTAG references) in startup scripts
8. Examine web application code for authentication bypass or command injection

**Pitfalls**:
- Some vendors use non-standard SquashFS with custom compression; use sasquatch instead of unsquashfs
- Encrypted firmware requires decryption keys often fo

Files: 4

Size: 42.4 KB

Complexity: 70/100

Category: Image & Video

Source: https://github.com/mukul975/anthropic-cybersecurity-skills/tree/main/skills/performing-firmware-extraction-with-binwalk
