analyzing-macro-malware-in-office-documents

Analyzes malicious VBA macros embedded in Microsoft Office documents (Word, Excel, PowerPoint) to identify download cradles, payload execution, persistence mechanisms, and anti-analysis techniques. Uses olevba, oledump, and VBA deobfuscation to extract the attack chain. Activates for requests involving Office macro analysis, VBA malware investigation, maldoc analysis, or document-based threat examination.


# Analyzing Macro Malware in Office Documents

## When to Use

- A suspicious Office document (.doc, .docm, .xls, .xlsm, .ppt) has been flagged by email security
- Investigating phishing campaigns that deliver weaponized Office documents
- Extracting VBA macro code to identify the payload download URL and execution method
- Analyzing obfuscated VBA code to understand the full attack chain
- Determining if a document uses DDE, ActiveX, or remote template injection instead of macros

**Do not use** as the only tool for non-macro Office threats (DDE, remote template injection): this skill detects them in Step 5, but in-depth analysis of those vectors may need specialized tooling.

## Prerequisites

- Python 3.8+ with oletools installed (`pip install oletools`)
- oledump.py from Didier Stevens (https://blog.didierstevens.com/programs/oledump-py/)
- Isolated analysis VM without Microsoft Office installed (prevents accidental execution)
- XLMMacroDeobfuscator for Excel 4.0 macro analysis (`pip install XLMMacroDeobfuscator`; provides the `xlmdeobfuscator` command)
- LibreOffice for safe document rendering (does not execute VBA macros by default)

## Workflow

### Step 1: Initial Document Triage

Determine if the document contains macros or other active content:

```bash
# Quick triage with olevba
olevba suspect.docm

# Check for OLE streams and macros
oleid suspect.docm

# Output indicators:
# VBA Macros:        True/False
# XLM Macros:        True/False
# External Relationships: True/False (remote template)
# ObjectPool:        True/False (embedded objects)
# Flash:             True/False (SWF objects)

# Comprehensive OLE analysis
oledump.py suspect.docm

# The listing shows every OLE stream with a macro indicator:
# Streams marked with 'M' contain VBA macro code
# Streams marked with 'm' contain only macro attributes (no code)
```
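Before running the tools above, it can help to know which container format the file really uses, since the extension may be misleading. The following is a minimal sketch using only the standard library; `sniff_container` is a hypothetical helper name, and the `/macrosheets/` path check is an assumption about where OOXML workbooks typically keep Excel 4.0 macro sheets. It does not replace `oleid`.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (used by OOXML)

def sniff_container(data: bytes) -> dict:
    """Classify a document by magic bytes and list macro-related ZIP parts."""
    result = {"container": "unknown", "vba_parts": [], "macrosheets": []}
    if data.startswith(OLE_MAGIC):
        # Legacy binary format (.doc/.xls/.ppt): inspect with oledump or olevba
        result["container"] = "ole2"
    elif data.startswith(ZIP_MAGIC):
        result["container"] = "zip"
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            for name in z.namelist():
                lower = name.lower()
                if lower.endswith("vbaproject.bin"):
                    result["vba_parts"].append(name)      # VBA project storage
                elif "/macrosheets/" in lower:
                    result["macrosheets"].append(name)    # assumed XLM sheet location
    return result
```

Usage: read the sample with `open("suspect.docm", "rb").read()` and pass the bytes in. A `.docx` that reports a non-empty `vba_parts` list, or a file that reports `unknown`, deserves a closer look.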

### Step 2: Extract and Analyze VBA Code

Pull out the complete VBA macro source:

```bash
# Extract VBA with full deobfuscation
olevba --decode --deobf suspect.docm

# Extract just the VBA source code
olevba --code suspect.docm > extracted_vba.txt

# Detailed extraction with oledump
oledump.py -s 8 -v suspect.docm  # Stream 8 (adjust based on stream listing)

# Highlight Declare statements and CreateObject calls in macro streams
oledump.py -p plugin_vba_dco suspect.docm
```

```
Key VBA Elements to Identify:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Auto-Execution Triggers:
  - AutoOpen / AutoClose / AutoExec (Word)
  - Document_Open / Document_Close (Word)
  - Auto_Open / Auto_Close (Excel)
  - Workbook_Open (Excel)

Suspicious Functions:
  - Shell() / Shell.Application
  - WScript.Shell.Run / Exec
  - CreateObject("WScript.Shell")
  - PowerShell execution
  - URLDownloadToFile
  - MSXML2.XMLHTTP (HTTP requests)
  - ADODB.Stream (file writing)
  - Environ() (environment variables)
  - CallByName (indirect method calls)
```
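The checklist above can be applied mechanically to the extracted source. This is a rough sketch, not a substitute for olevba's own keyword analysis: `scan_vba` is a hypothetical helper, matching is a plain case-insensitive search, and it only skips full-line comments, so hits still need manual confirmation.

```python
import re

AUTO_EXEC = ["AutoOpen", "AutoClose", "AutoExec", "Auto_Open", "Auto_Close",
             "Document_Open", "Document_Close", "Workbook_Open"]
SUSPICIOUS = ["Shell", "WScript.Shell", "CreateObject", "PowerShell",
              "URLDownloadToFile", "MSXML2.XMLHTTP", "ADODB.Stream",
              "Environ", "CallByName"]

def scan_vba(code: str) -> dict:
    """Map each keyword to the 1-based line numbers where it appears."""
    hits = {}
    for lineno, line in enumerate(code.splitlines(), start=1):
        if line.strip().startswith("'"):   # skip full-line VBA comments
            continue
        for keyword in AUTO_EXEC + SUSPICIOUS:
            # Leading boundary only, so URLDownloadToFileA and Environ$ still match
            pattern = r"(?<![A-Za-z0-9_])" + re.escape(keyword)
            if re.search(pattern, line, re.IGNORECASE):
                hits.setdefault(keyword, []).append(lineno)
    return hits
```

Usage: `scan_vba(open("extracted_vba.txt", errors="replace").read())` returns a dictionary such as `{"AutoOpen": [1], "CreateObject": [3]}`, which gives a starting point for reading the code in execution order.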

### Step 3: Deobfuscate VBA Code

Remove obfuscation layers to reveal the payload:

```python
# VBA deobfuscation helpers: static string rewriting only, nothing is executed
import re

FLAGS = re.IGNORECASE  # VBA identifiers are case-insensitive

def vba_quote(text):
    # Emit a VBA string literal; embedded quotes are doubled
    return '"' + text.replace('"', '""') + '"'

def join_literals(code):
    # Remove string concatenation: "htt" & "p://" -> "http://"
    return re.sub(r'"\s*&\s*"', '', code)

def deobfuscate_vba(code):
    # 1. Resolve Chr()/ChrW() calls to quoted literals so they can be joined:
    #    Chr(104) & ChrW(116) -> "h" & "t"
    def resolve_chr(match):
        try:
            return vba_quote(chr(int(match.group(1))))
        except (ValueError, OverflowError):
            return match.group(0)  # out-of-range code point: leave untouched
    code = re.sub(r'\bChrW?\$?\(\s*(\d+)\s*\)', resolve_chr, code, flags=FLAGS)

    # 2. Join adjacent literals: "h" & "t" -> "ht"
    code = join_literals(code)

    # 3. Resolve StrReverse: StrReverse("exe.daolnwod") -> "download.exe"
    def resolve_reverse(match):
        return vba_quote(match.group(1)[::-1])
    code = re.sub(r'\bStrReverse\(\s*"([^"]+)"\s*\)', resolve_reverse, code, flags=FLAGS)

    # 4. Mid$/Left$/Right$ obfuscation is not handled here (mark for manual review)

    # 5. Resolve Replace(): Replace("Powxershxell", "x", "") -> "Powershell"
    def resolve_replace(match):
        original, find, replace_with = match.groups()
        return vba_quote(original.replace(find, replace_with))
    code = re.sub(r'\bReplace\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*,\s*"([^"]*)"\s*\)',
                  resolve_replace, code, flags=FLAGS)

    # 6. Join literals again, since steps 3 and 5 can expose new neighbours
    return join_literals(code)

if __name__ == "__main__":
    with open("extracted_vba.txt", errors="replace") as f:
        vba_code = f.read()

    print(deobfuscate_vba(vba_code))
```
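The `Mid$`/`Left$`/`Right$` case that the script leaves for manual review can be partly automated when every argument is a literal. The sketch below covers only that simple case; `resolve_substrings` is a hypothetical helper, and calls that reference variables or nested expressions are left unchanged for manual analysis.

```python
import re

def resolve_substrings(code: str) -> str:
    """Fold Left/Right/Mid calls whose arguments are all literals."""
    def left(m):
        return '"' + m.group(1)[:int(m.group(2))] + '"'

    def right(m):
        n = int(m.group(2))
        return '"' + (m.group(1)[-n:] if n else "") + '"'

    def mid(m):
        start = int(m.group(2))          # VBA string positions are 1-based
        if start < 1:                    # invalid in VBA, so leave the call alone
            return m.group(0)
        text = m.group(1)[start - 1:]
        if m.group(3) is not None:       # optional length argument
            text = text[:int(m.group(3))]
        return '"' + text + '"'

    lit, num = r'\s*"([^"]*)"\s*', r'\s*(\d+)\s*'
    code = re.sub(r'\bLeft\$?\(' + lit + ',' + num + r'\)', left, code, flags=re.I)
    code = re.sub(r'\bRight\$?\(' + lit + ',' + num + r'\)', right, code, flags=re.I)
    code = re.sub(r'\bMid\$?\(' + lit + ',' + num + r'(?:,' + num + r')?\)',
                  mid, code, flags=re.I)
    return code
```

For example, `resolve_substrings('Mid("xxhttpyy", 3, 4)')` returns `"http"` as a quoted literal. Running it before the concatenation step lets the resulting literals be joined with their neighbours.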

### Step 4: Analyze Excel 4.0 (XLM) Macros

Handle legacy Excel macros that bypass VBA detection:

```bash
# Detect XLM macros (recent oletools releases extract Excel 4.0 macros by default)
olevba suspect.xlsm

# Deobfuscate XLM macros
xlmdeobfuscator -f suspect.xlsm

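# Manual XLM analysis with oledump (plugin_biff parses BIFF records, so use it
# on binary .xls workbooks; the -x plugin option selects Excel 4.0 macro records)
oledump.py -p plugin_biff.py --pluginoptions "-x" suspect.xls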

# XLM (Excel 4.0) macro functions to watch for:
# EXEC()       - Execute shell command
# CALL()       - Call DLL function
# REGISTER()   - Register DLL function
# URLDownloadToFileA - Win32 API (urlmon.dll) reached via CALL()/REGISTER() to download a file
# ALERT()      - Display message (social engineering)
# HALT()       - Stop execution
# GOTO()       - Control flow
# IF()         - Conditional execution
```
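Once the formulas have been dumped to text, the watch list above can be checked line by line. This is a small sketch under the assumption that each input line holds one formula as plain text; `flag_xlm` is a hypothetical helper and does not depend on any particular tool's output layout.

```python
import re

XLM_FUNCTIONS = {
    "EXEC": "executes a shell command",
    "CALL": "calls a DLL function",
    "REGISTER": "registers a DLL function",
    "ALERT": "displays a message",
    "HALT": "stops execution",
}

def flag_xlm(lines):
    """Yield (line_number, function, note) for each watched function call."""
    names = "|".join(XLM_FUNCTIONS)
    # Require an opening parenthesis and no identifier character in front
    pattern = re.compile(r"(?<![A-Za-z0-9_.])(" + names + r")\s*\(", re.IGNORECASE)
    for lineno, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            name = match.group(1).upper()
            yield lineno, name, XLM_FUNCTIONS[name]
```

Usage: `for hit in flag_xlm(open("xlm_dump.txt", errors="replace")): print(hit)`, where `xlm_dump.txt` is a placeholder for wherever the deobfuscated formulas were saved.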

### Step 5: Check for Non-Macro Attack Vectors

Examine the document for DDE, remote templates, and embedded objects:

```bash
# Check for DDE (Dynamic Data Exchange) fields and remote templates.
# A quoted heredoc passes the script to Python verbatim, with no shell escaping.
python3 - <<'EOF'
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not name.endswith(('.xml', '.rels')):
            continue
        content = z.read(name).decode('utf-8', errors='ignore')
        # DDE field codes (a field split across several runs can evade this check)
        if 'DDEAUTO' in content or 'DDE ' in content:
            print(f'[+] DDE found in {name}')
            for cmd in re.findall(r'DDEAUTO[^"]*"([^"]+)"', content):
                print(f'    Command: {cmd}')
        # Remote template: an attachedTemplate relationship with an http(s) target
        if 'attachedTemplate' in content:
            for url in re.findall(r'Target="(https?://[^"]+)"', content):
                print(f'[+] Remote template URL: {url}')
EOF

# Extract embedded OLE objects (oleobj ships with oletools)
oleobj suspect.docm

# Check relationships for external references.
# Only Target values are matched: every .rels file contains "http" in its
# XML namespace, so a plain substring test would flag all of them.
python3 - <<'EOF'
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not name.endswith('.rels'):
            continue
        content = z.read(name).decode('utf-8', errors='ignore')
        targets = re.findall(r'Target="((?:https?|ftp)://[^"]+)"', content, re.IGNORECASE)
        if targets:
            print(f'External reference in {name}:')
            for url in targets:
                print(f'  {url}')
EOF
```
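Word can split a field instruction across several runs, which defeats a plain substring search on the raw XML. The sketch below joins the `w:instrText` runs of each paragraph before looking for DDE; `find_dde_fields` is a hypothetical helper, and it assumes the main body lives in `word/document.xml`. The standard-library XML parser is not hardened against malicious XML, so a hardened parser such as `defusedxml` may be a safer choice for untrusted samples.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DDE_RE = re.compile(r"\bDDE(?:AUTO)?\b", re.IGNORECASE)

def find_dde_fields(docx_bytes: bytes, part: str = "word/document.xml"):
    """Return joined field instructions that mention DDE or DDEAUTO."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        root = ET.fromstring(z.read(part))
    findings = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Complex fields: concatenate every instrText run in the paragraph
        instr = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}instrText"))
        if DDE_RE.search(instr):
            findings.append(" ".join(instr.split()))
    # Simple fields keep the instruction in the w:instr attribute
    for fld in root.iter(f"{{{W_NS}}}fldSimple"):
        instr = fld.get(f"{{{W_NS}}}instr", "")
        if DDE_RE.search(instr):
            findings.append(" ".join(instr.split()))
    return findings
```

Usage: `find_dde_fields(open("suspect.docx", "rb").read())` returns a list of normalized field instructions. Headers, footers, and footnotes are separate parts and can be checked by passing their names as `part`.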

### Step 6: Generate Analysis Report

Document the complete macro malware analysis:

```
Report should include:
- Document metadata (author, creation date, modification date)
- Macro presence and type (VBA, XLM, DDE, remote template)
- Auto-execution trigger identified
- Deobfuscated VBA source code (key functions)
- Download URL(s) for second-stage payloads
- Execution method (Shell, WScript, PowerShell, COM object)
- Social engineering lure description
- Extracted IOCs (URLs, domains, IPs, file hashes)
- YARA rule for the specific document pattern
```
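The IOC and file-hash items in the report can be collected from the deobfuscated text with a few standard-library calls. This is a minimal sketch: `extract_iocs` and `sha256_of` are hypothetical helpers, the regular expressions are deliberately simple, and dotted version strings such as `1.2.3.4` will show up as IP candidates that need manual review.

```python
import hashlib
import ipaddress
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"""https?://[^\s"'<>)]+""", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def _host(url: str):
    try:
        return urlsplit(url).hostname
    except ValueError:            # malformed URL, e.g. unbalanced brackets
        return None

def extract_iocs(text: str) -> dict:
    """Collect URLs, domains, and IPv4 addresses from deobfuscated code."""
    urls = sorted(set(URL_RE.findall(text)))
    ips = set()
    for candidate in IPV4_RE.findall(text):
        try:
            ips.add(str(ipaddress.IPv4Address(candidate)))
        except ValueError:        # octet out of range or otherwise invalid
            pass
    hosts = {_host(u) for u in urls} - {None}
    domains = sorted(h for h in hosts if h not in ips)
    return {"urls": urls, "domains": domains, "ips": sorted(ips)}

def sha256_of(data: bytes) -> str:
    """SHA-256 of the sample, for the file-hash entry in the report."""
    return hashlib.sha256(data).hexdigest()
```

Usage: call `extract_iocs(deobfuscated)` on the output of Step 3 and `sha256_of(open("suspect.docm", "rb").read())` on the original sample, then paste both into the report.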

## Key Concepts

| Term | Definition |
|------|------------|
| **VBA Macro** | Visual Basic for Applications code embedded in Office documents that can interact with the OS, download files, and execute commands |
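| **Auto_Open / AutoOpen** | Auto-execution macro names: `Auto_Open` runs when an Excel workbook is opened and `AutoOpen` when a Word document is opened; among the most common triggers for macro malware |
| **OLE (Object Linking and Embedding)** | Microsoft compound document technology; legacy binary Office files (.doc, .xls, .ppt) are OLE2 compound files made of streams, while OOXML files (.docm, .xlsm) are ZIP archives that keep VBA in an embedded OLE file (vbaProject.bin) |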
| **DDE (Dynamic Data Exchange)** | Legacy Windows IPC mechanism abused in documents to execute commands without macros; triggered by field code updates |
| **Remote Template Injection** | Attack loading a macro-enabled template from a remote URL when the document opens, bypassing initial macro detection |
| **XLM Macros (Excel 4.0)** | Legacy Excel macro language predating VBA; stored in hidden sheets and often missed by traditional VBA analysis tools |
| **Protected View** | Read-only Office sandbox for files from untrusted sources; the user must click "Enable Editing" to leave it and then "Enable Content" before macros run, so social engineering lures target both prompts |

Source: https://github.com/mukul975/anthropic-cybersecurity-skills/tree/main/skills/analyzing-macro-malware-in-office-documents