extracting-iocs-from-malware-samples

Extracts indicators of compromise (IOCs) from malware samples including file hashes, network indicators (IPs, domains, URLs), host artifacts (file paths, registry keys, mutexes), and behavioral patterns for threat intelligence sharing and detection rule creation. Activates for requests involving IOC extraction, threat indicator harvesting, malware indicator collection, or building detection content from samples.

Tags: Security, malware, IOC-extraction, threat-intelligence, indicators, detection, scripts

# Extracting IOCs from Malware Samples

## When to Use

- A malware analysis (static or dynamic) is complete and actionable indicators need to be extracted for defense teams
- Building blocklists for firewalls, proxies, and DNS sinkholes from analyzed samples
- Creating YARA rules, Snort/Suricata signatures, or SIEM detection content from malware artifacts
- Contributing to threat intelligence sharing platforms (MISP, OTX, ThreatConnect)
- Tracking malware campaigns by correlating IOCs across multiple samples

**Do not use** for IOCs from unverified sources unless they have been validated first; false positives in blocklists can disrupt legitimate business operations.
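Campaign tracking by correlation, mentioned in the list above, can be sketched as a simple inverted index from indicator to samples. This is a minimal illustration, not part of the skill itself: the `correlate_iocs` helper and the sample data (documentation-range addresses and example domains) are hypothetical.

```python
from collections import defaultdict

def correlate_iocs(samples):
    """Map each IOC to the samples it appears in, keeping only shared IOCs."""
    seen = defaultdict(set)
    for sample_name, iocs in samples.items():
        for ioc in iocs:
            seen[ioc].add(sample_name)
    # An IOC observed in two or more samples hints at shared infrastructure
    return {ioc: names for ioc, names in seen.items() if len(names) > 1}

# Hypothetical per-sample IOC sets, for illustration only
samples = {
    "sample_a": {"203.0.113.10", "update.example.net", "Global\\demo_mutex"},
    "sample_b": {"203.0.113.10", "cdn.example.org"},
    "sample_c": {"198.51.100.7", "update.example.net"},
}

shared = correlate_iocs(samples)
for ioc, names in sorted(shared.items()):
    print(f"{ioc}: {sorted(names)}")
```

Overlap alone is weak evidence; shared hosting or sinkhole addresses can link unrelated samples, so treat the result as a lead to investigate rather than an attribution.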

## Prerequisites

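- Python 3.8+ with the `iocextract`, `pefile`, and `yara-python` libraries installed; the scripts below also import `tlsh`, `requests`, and `stix2`
- `ssdeep` and `tshark` command-line tools for fuzzy hashing and PCAP parsing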
- Completed malware analysis report (static analysis, dynamic analysis, or reverse engineering)
- Access to PCAP files, memory dumps, or sandbox reports from the analysis
- MISP instance or STIX/TAXII server for structured IOC sharing
- VirusTotal API key for IOC enrichment and validation
- CyberChef for decoding obfuscated indicators

## Workflow

### Step 1: Extract File-Based IOCs

Compute hashes and identify file metadata indicators:

```bash
# Generate all standard hashes
md5sum malware_sample.exe
sha1sum malware_sample.exe
sha256sum malware_sample.exe

# Generate ssdeep fuzzy hash for similarity matching
ssdeep malware_sample.exe

# Generate imphash (import hash) for PE files
python3 -c "
import pefile
pe = pefile.PE('malware_sample.exe')
print(f'Imphash: {pe.get_imphash()}')
"

# Generate TLSH (Trend Micro Locality Sensitive Hash)
python3 -c "
import tlsh
with open('malware_sample.exe', 'rb') as f:
    h = tlsh.hash(f.read())
print(f'TLSH: {h}')
"

# Compile file metadata IOCs
python3 << 'PYEOF'
import datetime
import hashlib

import pefile

pe = pefile.PE("malware_sample.exe")

print("FILE IOCs:")
with open("malware_sample.exe", "rb") as f:
    data = f.read()
    print(f"  MD5:        {hashlib.md5(data).hexdigest()}")
    print(f"  SHA-1:      {hashlib.sha1(data).hexdigest()}")
    print(f"  SHA-256:    {hashlib.sha256(data).hexdigest()}")
    print(f"  File Size:  {len(data)} bytes")

# TimeDateStamp is seconds since the Unix epoch and can be forged by the author
ts = pe.FILE_HEADER.TimeDateStamp
compiled = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
print(f"  Compile:    {compiled:%Y-%m-%d %H:%M:%S} UTC")
print(f"  Imphash:    {pe.get_imphash()}")
PYEOF
```
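The commands above read the whole sample into memory, which is fine for typical executables but awkward for large memory dumps or disk images. As a sketch under that assumption, the hypothetical `hash_file` helper below computes the three cryptographic hashes in one streaming pass using only the standard library.

```python
import hashlib

def hash_file(path, chunk_size=1024 * 1024):
    """Return MD5, SHA-1, and SHA-256 hex digests computed in a single pass."""
    hashers = {
        "md5": hashlib.md5(),
        "sha1": hashlib.sha1(),
        "sha256": hashlib.sha256(),
    }
    with open(path, "rb") as f:
        # Read fixed-size chunks so the file is never fully loaded into memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Calling `hash_file("malware_sample.exe")` returns a dictionary keyed by `md5`, `sha1`, and `sha256`, which is convenient to merge into a larger IOC record.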

### Step 2: Extract Network IOCs

Pull network indicators from the strings embedded in the sample (PCAP-derived indicators are covered in Step 4):

```python
# Extract network IOCs from strings
import re

with open("malware_sample.exe", "rb") as f:
    data = f.read()

# Extract ASCII and Unicode strings
ascii_strings = re.findall(b'[ -~]{4,}', data)
unicode_strings = re.findall(b'(?:[ -~]\x00){4,}', data)

all_strings = [s.decode('ascii', errors='ignore') for s in ascii_strings]
all_strings += [s.decode('utf-16-le', errors='ignore') for s in unicode_strings]

# IP addresses (keeping only globally routable ones as C2 candidates)
import ipaddress

ip_pattern = re.compile(r'\b(?:(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\b')
ips = set()
for s in all_strings:
    for ip in ip_pattern.findall(s):
        # Drop private, loopback, link-local, and other non-global ranges
        if ipaddress.ip_address(ip).is_global:
            ips.add(ip)

# Domain names
domain_pattern = re.compile(r'\b[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z]{2,})+\b')
domains = set()
for s in all_strings:
    for d in domain_pattern.findall(s):
        # Skip file names that merely look like domains (e.g. kernel32.dll)
        if not d.lower().endswith(('.dll', '.exe', '.sys')):
            domains.add(d)

# URLs
url_pattern = re.compile(r'https?://[^\s<>"{}|\\^`\[\]]+')
urls = set()
for s in all_strings:
    for u in url_pattern.findall(s):
        urls.add(u)

print("NETWORK IOCs:")
print(f"  IPs:     {ips}")
print(f"  Domains: {domains}")
print(f"  URLs:    {urls}")
```
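String extraction tends to surface benign infrastructure as well, such as certificate revocation or vendor URLs embedded in the binary, and blocking those is exactly the kind of false positive warned about earlier. The sketch below shows one way to separate such candidates before they reach a blocklist. The `BENIGN_SUFFIXES` tuple and both helper names are hypothetical; a real allowlist should come from a maintained source.

```python
# Hypothetical allowlist, for illustration only
BENIGN_SUFFIXES = ("microsoft.com", "windowsupdate.com", "digicert.com")

def is_benign(domain, suffixes=BENIGN_SUFFIXES):
    """True if the domain equals, or is a subdomain of, an allowlisted suffix."""
    d = domain.lower().rstrip(".")
    # Require a label boundary so "notmicrosoft.com" does not match "microsoft.com"
    return any(d == s or d.endswith("." + s) for s in suffixes)

def split_candidates(domains):
    """Split extracted domains into (suspicious, benign) sets."""
    keep, dropped = set(), set()
    for d in domains:
        (dropped if is_benign(d) else keep).add(d)
    return keep, dropped

candidates = {"crl3.digicert.com", "www.microsoft.com",
              "c2.example.net", "notmicrosoft.com"}
suspicious, benign = split_candidates(candidates)
print("Suspicious:", sorted(suspicious))
print("Allowlisted:", sorted(benign))
```

Keeping the dropped set rather than discarding it lets an analyst review what was filtered, since malware does sometimes abuse legitimate services for C2.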

### Step 3: Extract Host-Based IOCs

Identify file paths, registry keys, mutexes, and services:

```python
# Extract host-based IOCs from sandbox report
import json

with open("cuckoo_report.json") as f:
    report = json.load(f)

print("HOST IOCs:")

# File paths created or modified
print("\nFile Paths:")
for f in report["behavior"]["summary"].get("files", []):
    if any(p in f.lower() for p in ["temp", "appdata", "system32", "programdata"]):
        print(f"  [DROPPED] {f}")

# Registry keys for persistence
print("\nRegistry Keys:")
for key in report["behavior"]["summary"].get("write_keys", []):
    if any(p in key.lower() for p in ["run", "service", "startup", "shell"]):
        print(f"  [PERSIST] {key}")

# Mutexes (often distinctive of a malware family)
print("\nMutexes:")
for mutex in report["behavior"]["summary"].get("mutexes", []):
    if mutex not in ["Local\\!IETld!Mutex", "RasPbFile"]:  # Filter known Windows mutexes
        print(f"  [MUTEX] {mutex}")

# Created services
print("\nServices:")
for svc in report["behavior"]["summary"].get("started_services", []):
    print(f"  [SERVICE] {svc}")
```
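File paths taken from a sandbox report contain the sandbox user's profile name, so they rarely match on a real endpoint as written. A minimal sketch of normalizing them to environment-variable form follows; it assumes the default Windows profile layout under `C:\Users`, and the `normalize_path` helper and pattern table are hypothetical.

```python
import re

# Assumed default profile layout; most specific patterns come first
_PATTERNS = [
    (re.compile(r"^[A-Za-z]:\\Users\\[^\\]+\\AppData\\Roaming(?=\\|$)", re.I), "%APPDATA%"),
    (re.compile(r"^[A-Za-z]:\\Users\\[^\\]+\\AppData\\Local\\Temp(?=\\|$)", re.I), "%TEMP%"),
    (re.compile(r"^[A-Za-z]:\\Users\\[^\\]+\\AppData\\Local(?=\\|$)", re.I), "%LOCALAPPDATA%"),
    (re.compile(r"^[A-Za-z]:\\Users\\[^\\]+(?=\\|$)", re.I), "%USERPROFILE%"),
]

def normalize_path(path):
    """Replace a user-specific prefix with its environment-variable form."""
    for pattern, replacement in _PATTERNS:
        new_path, count = pattern.subn(replacement, path, count=1)
        if count:
            return new_path
    return path  # Paths outside a user profile are returned unchanged

for p in [r"C:\Users\analyst\AppData\Roaming\svc.exe",
          r"C:\Users\analyst\AppData\Local\Temp\a.tmp",
          r"C:\Windows\System32\drivers\x.sys"]:
    print(normalize_path(p))
```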

### Step 4: Extract Network IOCs from PCAP

Parse network captures for additional indicators:

```bash
# Extract DNS queries from PCAP
tshark -r capture.pcap -T fields -e dns.qry.name -Y "dns.flags.response == 0" | sort -u

# Extract HTTP hosts and URLs
tshark -r capture.pcap -T fields -e http.host -e http.request.uri -Y "http.request" | sort -u

# Extract TLS server names (SNI)
tshark -r capture.pcap -T fields -e tls.handshake.extensions_server_name -Y "tls.handshake.type == 1" | sort -u

# Extract JA3 hashes
tshark -r capture.pcap -T fields -e tls.handshake.ja3 -Y "tls.handshake.type == 1" | sort -u

# Extract unique destination IPs (10.0.2.15 is assumed to be the sandbox guest; adjust to your analysis VM)
tshark -r capture.pcap -T fields -e ip.dst -Y "ip.src == 10.0.2.15" | sort -u

# Extract User-Agent strings
tshark -r capture.pcap -T fields -e http.user_agent -Y "http.user_agent" | sort -u
```
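The HTTP command above prints the host and request URI as two tab-separated fields, which still have to be joined into full URLs before they are useful as indicators. The sketch below does that for lines captured from `tshark`; the `build_urls` helper and the sample lines are hypothetical, and the scheme defaults to `http` because these fields come from cleartext requests.

```python
def build_urls(tshark_lines, scheme="http"):
    """Join tab-separated host/URI pairs from tshark into sorted unique URLs."""
    urls = set()
    for line in tshark_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # Skip malformed or empty lines
        host, uri = parts
        if not host:
            continue  # Requests without a Host header cannot form a URL
        urls.add(f"{scheme}://{host}{uri or '/'}")
    return sorted(urls)

# Hypothetical tshark output, for illustration only
sample_output = [
    "update.example.net\t/gate.php\n",
    "update.example.net\t/gate.php\n",
    "203.0.113.10\t/stage2.bin\n",
    "\t/orphan\n",
]

for url in build_urls(sample_output):
    print(url)
```

In practice the lines could be read from `sys.stdin` with the `tshark` command piped into the script.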

### Step 5: Defang and Validate IOCs

Defang indicators for safe sharing and validate against threat intelligence:

```python
# Defang IOCs for safe sharing
def defang_ip(ip):
    return ip.replace(".", "[.]")

def defang_url(url):
    # Replace only the scheme; a later "http" in the path or query stays intact
    return url.replace("http", "hxxp", 1).replace(".", "[.]")

def defang_domain(domain):
    return domain.replace(".", "[.]")

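# Validate IOCs against VirusTotal
import os
import requests

# Read the key from the environment rather than hard-coding it
VT_API_KEY = os.environ.get("VT_API_KEY", "your_api_key")

def check_vt_ip(ip):
    resp = requests.get(f"https://www.virustotal.com/api/v3/ip_addresses/{ip}",
                       headers={"x-apikey": VT_API_KEY}, timeout=30)
    resp.raise_for_status()  # Surface auth and quota errors instead of a KeyError
    data = resp.json()
    stats = data["data"]["attributes"]["last_analysis_stats"]
    return stats["malicious"]

def check_vt_domain(domain):
    resp = requests.get(f"https://www.virustotal.com/api/v3/domains/{domain}",
                       headers={"x-apikey": VT_API_KEY}, timeout=30)
    resp.raise_for_status()  # Surface auth and quota errors instead of a KeyError
    data = resp.json()
    stats = data["data"]["attributes"]["last_analysis_stats"]
    return stats["malicious"]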

# Validate each IOC
for ip in ips:
    detections = check_vt_ip(ip)
    print(f"  {defang_ip(ip)} - VT: {detections} detections")
```
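Defanged indicators are safe to paste into reports, but blocklists and lookup tools need the original form back. A minimal sketch of the reverse operation follows; the `refang` helper is hypothetical and handles only the bracket and `hxxp` conventions, plus the common `(.)` and `[:]` variants.

```python
import re

def refang(indicator):
    """Reverse common defanging so the indicator can be fed to tooling."""
    s = indicator.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    # Restore the scheme only at the start of the string
    return re.sub(r"^hxxp", "http", s, flags=re.IGNORECASE)

for item in ["hxxps://update[.]example[.]net/gate.php", "203[.]0[.]113[.]10"]:
    print(refang(item))
```

Refang only at the point of ingestion, and keep the defanged form in anything a human might click.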

### Step 6: Export IOCs in Standard Formats

Generate structured IOC outputs for sharing and ingestion:

```python
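# Export as STIX 2.1 bundle
# Reuses the `ips` and `domains` sets collected in Step 2
from stix2 import Indicator, Bundle, Malware, Relationship
import datetime
import hashlib

# SHA-256 of the sample, as computed in Step 1
with open("malware_sample.exe", "rb") as f:
    sha256_hash = hashlib.sha256(f.read()).hexdigest()

indicators = []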

# File hash indicator
indicators.append(Indicator(
    name="Malware SHA-256 Hash",
    pattern=f"[file:hashes.'SHA-256' = '{sha256_hash}']",
    pattern_type="stix",
    valid_from=datetime.datetime.now(datetime.timezone.utc),
    labels=["malicious-activity"]
))

# IP indicator
for ip in ips:
    indicators.append(Indicator(
        name=f"C2 IP Address {ip}",
        pattern=f"[ipv4-addr:value = '{ip}']",
        pattern_type="stix",
        valid_from=datetime.datetime.now(datetime.timezone.utc),
        labels=["malicious-activity"]
    ))

# Domain indicator
for domain in domains:
    indicators.append(Indicator(
        name=f"C2 Domain {domain}",
        pattern=f"[domain-name:value = '{domain}']",
        pattern_type="stix",
        valid_from=datetime.datetime.now(datetime.timezone.utc),
        labels=["malicious-activity"]
    ))
```
