analyzing-macro-malware-in-office-documents

Analyzes malicious VBA macros embedded in Microsoft Office documents (Word, Excel, PowerPoint) to identify download cradles, payload execution, persistence mechanisms, and anti-analysis techniques. Uses olevba, oledump, and VBA deobfuscation to extract the attack chain. Activates for requests involving Office macro analysis, VBA malware investigation, maldoc analysis, or document-based threat examination.

# Analyzing Macro Malware in Office Documents

## When to Use

- A suspicious Office document (.doc, .docm, .xls, .xlsm, .ppt) has been flagged by email security
- Investigating phishing campaigns that deliver weaponized Office documents
- Extracting VBA macro code to identify the payload download URL and execution method
- Analyzing obfuscated VBA code to understand the full attack chain
- Determining if a document uses DDE, ActiveX, or remote template injection instead of macros

**Do not use** as the only method for in-depth analysis of non-macro Office threats (DDE, remote template injection); this skill detects them, but full analysis may require specialized tooling.

## Prerequisites

- Python 3.8+ with oletools installed (`pip install oletools`)
- oledump.py from Didier Stevens (https://blog.didierstevens.com/programs/oledump-py/)
- Isolated analysis VM without Microsoft Office installed (prevents accidental execution)
- XLMMacroDeobfuscator for Excel 4.0 macro analysis (`pip install XLMMacroDeobfuscator`; provides the `xlmdeobfuscator` command)
- LibreOffice for safe document rendering (does not execute VBA macros by default)

## Workflow

### Step 1: Initial Document Triage

Determine if the document contains macros or other active content:

```bash
# Quick triage with olevba
olevba suspect.docm

# Check for OLE streams and macros
oleid suspect.docm

# Output indicators:
# VBA Macros:        True/False
# XLM Macros:        True/False
# External Relationships: True/False (remote template)
# ObjectPool:        True/False (embedded objects)
# Flash:             True/False (SWF objects)

# Comprehensive OLE analysis
oledump.py suspect.docm

# List all OLE streams with macro indicators
# Streams marked with 'M' contain VBA macros
# Streams marked with 'm' contain macro attributes
```
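When the tools above are not yet installed, a rough first pass can be made with the Python standard library alone. The sketch below is an approximation of the triage, not a replacement for `oleid`: the function name `triage_container` and the keys of the returned dictionary are hypothetical, and legacy OLE files are only recognized by their signature, not parsed.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature

def triage_container(data: bytes) -> dict:
    """Classify a document by container type and list macro-related parts."""
    result = {"container": "unknown", "vba_project": False,
              "xlm_macrosheets": [], "external_rels": False}
    if data.startswith(OLE_MAGIC):
        # Legacy .doc/.xls/.ppt: stream-level inspection needs oledump or olevba
        result["container"] = "ole"
    elif zipfile.is_zipfile(io.BytesIO(data)):
        result["container"] = "zip"
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            for name in z.namelist():
                lower = name.lower()
                if lower.endswith("vbaproject.bin"):
                    result["vba_project"] = True       # VBA project present
                if "/macrosheets/" in lower:
                    result["xlm_macrosheets"].append(name)  # Excel 4.0 macro sheet
                if lower.endswith(".rels") and b'TargetMode="External"' in z.read(name):
                    result["external_rels"] = True     # possible remote template
    return result
```

Calling `triage_container(open("suspect.docm", "rb").read())` gives a quick indication of which of the later steps are worth running; any positive result should still be confirmed with `oleid` and `oledump.py`.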

### Step 2: Extract and Analyze VBA Code

Pull out the complete VBA macro source:

```bash
# Extract VBA with full deobfuscation
olevba --decode --deobf suspect.docm

# Extract just the VBA source code
olevba --code suspect.docm > extracted_vba.txt

# Detailed extraction with oledump
oledump.py -s 8 -v suspect.docm  # Stream 8 (adjust based on stream listing)

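# Extract the source of all macro streams
oledump.py -s a -v suspect.docm

# List Declare statements and CreateObject/GetObject calls
oledump.py -p plugin_vba_dco suspect.docm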
```

```
Key VBA Elements to Identify:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Auto-Execution Triggers:
  - AutoOpen (Word) / Auto_Open (Excel)
  - AutoClose (Word) / Auto_Close (Excel)
  - Document_Open / Document_Close (Word)
  - Workbook_Open (Excel)
  - AutoExec (Word)

Suspicious Functions:
  - Shell() / Shell.Application
  - WScript.Shell.Run / Exec
  - CreateObject("WScript.Shell")
  - PowerShell execution
  - URLDownloadToFile
  - MSXML2.XMLHTTP (HTTP requests)
  - ADODB.Stream (file writing)
  - Environ() (environment variables)
  - CallByName (indirect method calls)
```
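The checklist above can be turned into a quick keyword scan over the extracted source. This is a minimal sketch, assuming the VBA has already been saved as text (for example `extracted_vba.txt` from the previous command); the function name `scan_vba` and the two keyword lists are illustrative and simply mirror the checklist.

```python
import re

AUTO_EXEC = ["AutoOpen", "Auto_Open", "AutoClose", "Auto_Close", "AutoExec",
             "Document_Open", "Document_Close", "Workbook_Open"]
SUSPICIOUS = ["Shell", "WScript.Shell", "CreateObject", "PowerShell",
              "URLDownloadToFile", "MSXML2.XMLHTTP", "ADODB.Stream",
              "Environ", "CallByName"]

def scan_vba(code: str) -> dict:
    """Map each indicator to the line numbers where it appears (case-insensitive)."""
    hits = {}
    for lineno, line in enumerate(code.splitlines(), start=1):
        if line.lstrip().startswith("'"):
            continue  # skip VBA comment lines
        for keyword in AUTO_EXEC + SUSPICIOUS:
            # Require that the keyword is not part of a longer identifier
            pattern = r"(?<![\w.])" + re.escape(keyword) + r"(?!\w)"
            if re.search(pattern, line, re.IGNORECASE):
                hits.setdefault(keyword, []).append(lineno)
    return hits
```

Because obfuscated samples often build these names at runtime, an empty result does not mean the macro is benign; running the scan again after deobfuscation is usually more informative.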

### Step 3: Deobfuscate VBA Code

Remove obfuscation layers to reveal the payload:

```python
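# VBA deobfuscation techniques (static and regex-based; verify results manually)
import re

def merge_literals(code):
    # Join adjacent string literals: "htt" & "p://" -> "http://"
    return re.sub(r'"\s*[&+]\s*"', '', code)

def deobfuscate_vba(code):
    # 1. Resolve Chr()/Chr$() calls into quoted literals:
    #    Chr(104) & Chr(116) -> "h" & "t", which step 2 merges into "ht"
    def resolve_chr(match):
        try:
            char = chr(int(match.group(1)))
        except (ValueError, OverflowError):
            return match.group(0)
        if not char.isprintable():
            return match.group(0)  # leave control characters such as Chr(10) as calls
        # A double quote inside a VBA string literal is written as ""
        return '"' + char.replace('"', '""') + '"'
    code = re.sub(r'\bChr\$?\(\s*(\d+)\s*\)', resolve_chr, code, flags=re.IGNORECASE)

    # 2. Resolve ChrW()/ChrW$() calls the same way: ChrW(104) -> "h"
    code = re.sub(r'\bChrW\$?\(\s*(\d+)\s*\)', resolve_chr, code, flags=re.IGNORECASE)

    # 3. Remove string concatenation so later steps see whole literals
    code = merge_literals(code)

    # 4. Resolve StrReverse: StrReverse("exe.daolnwod") -> "download.exe"
    def resolve_reverse(match):
        return '"' + match.group(1)[::-1] + '"'
    code = re.sub(r'\bStrReverse\("([^"]+)"\)', resolve_reverse, code, flags=re.IGNORECASE)

    # 5. Mid$/Left$/Right$ obfuscation is not handled here (mark for manual review)

    # 6. Resolve Replace(): Replace("Powxershxell", "x", "") -> "Powershell"
    def resolve_replace(match):
        original = match.group(1)
        find = match.group(2)
        replace_with = match.group(3)
        return '"' + original.replace(find, replace_with) + '"'
    code = re.sub(r'\bReplace\("([^"]+)",\s*"([^"]+)",\s*"([^"]*)"\)',
                  resolve_replace, code, flags=re.IGNORECASE)

    # 7. Merge again: steps 4 and 6 may have produced new adjacent literals
    return merge_literals(code)

with open("extracted_vba.txt", encoding="utf-8", errors="replace") as f:
    vba_code = f.read()

deobfuscated = deobfuscate_vba(vba_code)
print(deobfuscated)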
```
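The `Mid$`/`Left$`/`Right$` case that the script leaves for manual review can be partly automated when every argument is a literal. The following is a sketch under that assumption only; the helper name `resolve_substrings` is hypothetical, and calls that take variables or expressions are left untouched.

```python
import re

def _quote(text: str) -> str:
    return '"' + text + '"'

def resolve_substrings(code: str) -> str:
    """Fold Left/Right/Mid calls whose arguments are all literals."""
    def left(m):
        return _quote(m.group(1)[:int(m.group(2))])

    def right(m):
        n = int(m.group(2))
        return _quote(m.group(1)[-n:] if n else "")  # text[-0:] would return everything

    def mid(m):
        start = int(m.group(2))      # VBA string positions are 1-based
        if start < 1:
            return m.group(0)        # invalid in VBA; leave the call as-is
        text = m.group(1)[start - 1:]
        if m.group(3) is not None:   # optional length argument
            text = text[:int(m.group(3))]
        return _quote(text)

    flags = re.IGNORECASE
    code = re.sub(r'\bLeft\$?\(\s*"([^"]*)"\s*,\s*(\d+)\s*\)', left, code, flags=flags)
    code = re.sub(r'\bRight\$?\(\s*"([^"]*)"\s*,\s*(\d+)\s*\)', right, code, flags=flags)
    code = re.sub(r'\bMid\$?\(\s*"([^"]*)"\s*,\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)',
                  mid, code, flags=flags)
    return code
```

For example, `Mid("xxhttpxx", 3, 4)` folds to `"http"`. String literals that contain escaped quotes (`""`) are not matched by these patterns and still need manual review.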

### Step 4: Analyze Excel 4.0 (XLM) Macros

Handle legacy Excel macros that bypass VBA detection:

```bash
# Detect XLM macros (recent olevba versions extract them by default)
olevba suspect.xlsm

# Deobfuscate XLM macros
xlmdeobfuscator -f suspect.xlsm

# Manual XLM analysis with oledump (plugin_biff parses legacy BIFF .xls files)
oledump.py -p plugin_biff.py --pluginoptions "-x" suspect.xls

# XLM (Excel 4.0) macro functions to watch for:
# EXEC()       - Execute shell command
# CALL()       - Call DLL function
# REGISTER()   - Register DLL function
# URLDownloadToFileA - Windows API (urlmon.dll) invoked via CALL/REGISTER to download a file
# ALERT()      - Display message (social engineering)
# HALT()       - Stop execution
# GOTO()       - Control flow
# IF()         - Conditional execution
```
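XLM macro sheets are frequently marked hidden or very hidden so that the victim never sees them. For OOXML workbooks (.xlsx, .xlsm), sheet visibility is recorded in `xl/workbook.xml`, and a small standard-library sketch can list it. The function name `list_sheet_visibility` is hypothetical, the transitional OOXML namespace is assumed, binary `.xlsb` and legacy `.xls` files are not covered, and the XML is assumed to be parsed only inside the isolated analysis VM.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Transitional SpreadsheetML namespace (assumed; strict OOXML uses a different URI)
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

def list_sheet_visibility(data: bytes) -> list:
    """Return (sheet name, state) pairs read from xl/workbook.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        root = ET.fromstring(z.read("xl/workbook.xml"))
    sheets = []
    for sheet in root.findall("m:sheets/m:sheet", NS):
        # A missing state attribute means the sheet is visible
        sheets.append((sheet.get("name"), sheet.get("state", "visible")))
    return sheets
```

Any sheet reported as `hidden` or `veryHidden` is worth opening in the deobfuscator output first, since that is where auto-executing XLM formulas are commonly placed.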

### Step 5: Check for Non-Macro Attack Vectors

Examine the document for DDE, remote templates, and embedded objects:

```bash
# Check for DDE (Dynamic Data Exchange) and remote templates
# (OOXML files only; a quoted heredoc avoids shell escaping and history expansion)
python3 - <<'EOF'
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not (name.endswith('.xml') or name.endswith('.rels')):
            continue
        content = z.read(name).decode('utf-8', errors='ignore')
        # DDE field codes (the command may be missed if Word split the field across runs)
        if re.search(r'\bDDE(AUTO)?\b', content):
            print(f'[!] DDE found in {name}')
            for m in re.findall(r'DDEAUTO[^"]*"([^"]+)"', content):
                print(f'    Command: {m}')
        # Remote template: an attachedTemplate relationship with an http(s) target
        for rel in re.findall(r'<Relationship\b[^>]*>', content):
            if 'attachedTemplate' in rel:
                for url in re.findall(r'Target="(https?://[^"]+)"', rel):
                    print(f'[!] Remote template URL: {url}')
EOF

# Check for embedded OLE objects (oleobj ships with oletools)
oleobj suspect.docm

# Check relationships for external references
python3 - <<'EOF'
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not name.endswith('.rels'):
            continue
        content = z.read(name).decode('utf-8', errors='ignore')
        # Only relationships marked TargetMode="External" point outside the package
        for rel in re.findall(r'<Relationship\b[^>]*>', content):
            if 'TargetMode="External"' in rel:
                target = re.search(r'Target="([^"]+)"', rel)
                if target:
                    print(f'External reference in {name}: {target.group(1)}')
EOF
```
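Word often splits a field instruction across several `w:instrText` runs, so a plain substring search for `DDEAUTO` can miss it or recover only part of the command. The sketch below reassembles the fragments per paragraph before matching. It is an approximation: the function name `find_dde_fields` is hypothetical, fields that span paragraphs are not rejoined, and body text that merely mentions DDE is deliberately ignored.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def find_dde_fields(data: bytes) -> list:
    """Return (part name, instruction) pairs for fields that invoke DDE or DDEAUTO."""
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        for name in z.namelist():
            if not (name.startswith("word/") and name.endswith(".xml")):
                continue
            try:
                root = ET.fromstring(z.read(name))
            except ET.ParseError:
                continue  # not well-formed XML; skip this part
            instructions = []
            for para in root.iter(f"{{{W}}}p"):
                # Join instrText fragments that Word split across runs
                joined = "".join(t.text or "" for t in para.iter(f"{{{W}}}instrText"))
                if joined:
                    instructions.append(joined)
            for simple in root.iter(f"{{{W}}}fldSimple"):
                # Simple fields keep the whole instruction in one attribute
                instructions.append(simple.get(f"{{{W}}}instr", ""))
            for instr in instructions:
                if re.search(r"\bDDE(AUTO)?\b", instr, re.IGNORECASE):
                    found.append((name, " ".join(instr.split())))
    return found
```

The returned instruction string contains the full command line passed to DDE, which can go straight into the report as an IOC.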

### Step 6: Generate Analysis Report

Document the complete macro malware analysis:

```
Report should include:
- Document metadata (author, creation date, modification date)
- Macro presence and type (VBA, XLM, DDE, remote template)
- Auto-execution trigger identified
- Deobfuscated VBA source code (key functions)
- Download URL(s) for second-stage payloads
- Execution method (Shell, WScript, PowerShell, COM object)
- Social engineering lure description
- Extracted IOCs (URLs, domains, IPs, file hashes)
- YARA rule for the specific document pattern
```
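The IOC portion of the report can be seeded automatically from the deobfuscated source. This is a minimal sketch, assuming the document bytes and the deobfuscated VBA text are already available; the function name `extract_iocs` and the output keys are hypothetical, and the regular expressions are intentionally simple, so version strings that look like IPv4 addresses may appear as false positives.

```python
import hashlib
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"\b(?:https?|ftp)://[^\s\"'<>)]+", re.IGNORECASE)
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b")

def extract_iocs(document: bytes, deobfuscated_vba: str) -> dict:
    """Collect the file hash plus URLs, domains, and IPs found in the macro text."""
    urls = sorted(set(URL_RE.findall(deobfuscated_vba)))
    hosts = set()
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL; keep it in "urls" but skip host parsing
        if host:
            hosts.add(host)
    ips = set(IPV4_RE.findall(deobfuscated_vba))
    return {
        "sha256": hashlib.sha256(document).hexdigest(),
        "urls": urls,
        "domains": sorted(hosts - ips),  # hosts that are not bare IP addresses
        "ips": sorted(ips),
    }
```

The resulting dictionary can be dumped with `json.dumps` and attached to the report; every entry should still be reviewed by the analyst before it is shared as an indicator.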

## Key Concepts

| Term | Definition |
|------|------------|
| **VBA Macro** | Visual Basic for Applications code embedded in Office documents that can interact with the OS, download files, and execute commands |
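| **Auto_Open / AutoOpen** | VBA auto-execution macros: `Auto_Open` runs when an Excel workbook is opened and `AutoOpen` when a Word document is opened; together with `Document_Open` and `Workbook_Open`, they are the most common triggers for macro malware |
| **OLE (Object Linking and Embedding)** | Microsoft compound document format; legacy Office documents (.doc, .xls, .ppt) are OLE containers whose streams can hold macros and objects, while OOXML documents (.docm, .xlsm) are ZIP archives that store macros in an embedded OLE file, `vbaProject.bin` |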
| **DDE (Dynamic Data Exchange)** | Legacy Windows IPC mechanism abused in documents to execute commands without macros; triggered by field code updates |
| **Remote Template Injection** | Attack loading a macro-enabled template from a remote URL when the document opens, bypassing initial macro detection |
| **XLM Macros (Excel 4.0)** | Legacy Excel macro language predating VBA; stored in hidden sheets and often missed by traditional VBA analysis tools |
| **Protected View** | Read-only Office sandbox for files from untrusted sources; macros cannot run until the user clicks "Enable Editing" and then "Enable Content", so social engineering lures target both prompts |
