reverse-engineering-malware-with-ghidra
Reverse engineers malware binaries using NSA's Ghidra disassembler and decompiler to understand internal logic, cryptographic routines, C2 protocols, and evasion techniques at the assembly and pseudo-C level. Activates for requests involving malware reverse engineering, disassembly analysis, decompilation, binary analysis, or understanding malware internals.
What this skill does
# Reverse Engineering Malware with Ghidra
## When to Use
- Static and dynamic analysis have identified suspicious functionality that requires deeper code-level understanding
- You need to reverse engineer C2 communication protocols, encryption algorithms, or custom obfuscation
- Understanding the exact exploit mechanism or vulnerability targeted by a malware sample
- Extracting hardcoded configuration data (C2 addresses, encryption keys, campaign IDs) embedded in compiled code
- Developing precise YARA rules or detection signatures based on unique code patterns
**Do not use** for initial triage of unknown samples; perform static analysis with PEStudio and behavioral analysis with Cuckoo first.
## Prerequisites
- Ghidra 11.x installed (download from https://ghidra-sre.org/) with JDK 17+
- Analysis VM isolated from production network (Windows or Linux host)
- Familiarity with x86/x64 assembly language and Windows API conventions
- PDB symbol files for Windows system DLLs to improve decompilation accuracy
- Ghidra scripts repository (ghidra_scripts) for automated analysis tasks
- Secondary reference: IDA Free or Binary Ninja for cross-validation of analysis results
## Workflow
### Step 1: Create Project and Import Binary
Set up a Ghidra project and import the malware sample:
```
1. Launch Ghidra: ghidraRun (Linux) or ghidraRun.bat (Windows)
2. File -> New Project -> Non-Shared Project -> Select directory
3. File -> Import File -> Select malware binary
4. Ghidra auto-detects format (PE, ELF, Mach-O) and architecture
5. Accept default import options (or specify base address if known)
6. Double-click imported file to open in CodeBrowser
7. When prompted, run Auto Analysis with default analyzers enabled
```
**Headless analysis for automation:**
```bash
# Run Ghidra headless analysis with decompiler
/opt/ghidra/support/analyzeHeadless /tmp/ghidra_project MalwareProject \
-import suspect.exe \
-postScript ExportDecompilation.py \
-scriptPath /opt/ghidra/scripts/ \
-deleteProject
```
### Step 2: Identify Key Functions and Entry Points
Navigate the binary to locate critical code sections:
```
Navigation Strategy:
━━━━━━━━━━━━━━━━━━━
1. Start at entry point (OEP) - follow execution from _start/WinMain
2. Check Symbol Tree for imported functions (Window -> Symbol Tree)
3. Search for cross-references to suspicious APIs:
- VirtualAlloc/VirtualAllocEx (memory allocation for injection)
- CreateRemoteThread (remote thread injection)
- CryptEncrypt/CryptDecrypt (encryption operations)
- InternetOpen/HttpSendRequest (C2 communication)
- RegSetValueEx (persistence via registry)
4. Use Search -> For Strings to find embedded URLs, IPs, and paths
5. Check the Functions window sorted by size (large functions often contain core logic)
```
**Ghidra keyboard shortcuts for efficient navigation:**
```
G - Go to address
Ctrl+E - Search for strings
X - Show cross-references to current location
Ctrl+Shift+F - Search memory for byte patterns
L - Rename label/function
; - Add comment
T - Retype variable
Ctrl+L - Retype return value
```
### Step 3: Analyze Decompiled Code
Use Ghidra's decompiler to understand function logic:
```c
// Example: Ghidra decompiler output for a decryption routine
// Analyst renames variables and adds types for clarity
void decrypt_config(BYTE *encrypted_data, int data_len, BYTE *key, int key_len) {
// XOR decryption with rolling key
for (int i = 0; i < data_len; i++) {
encrypted_data[i] = encrypted_data[i] ^ key[i % key_len];
}
return;
}
// Analyst actions in Ghidra:
// 1. Right-click parameters -> Retype to correct types (BYTE*, int)
// 2. Right-click variables -> Rename to meaningful names
// 3. Add comments explaining the algorithm
// 4. Set function signature to propagate types to callers
```
### Step 4: Trace C2 Communication Logic
Follow the network communication code path:
```
Analysis Steps for C2 Protocol Reverse Engineering:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Find InternetOpenA/WinHttpOpen call -> trace to wrapper function
2. Follow data flow from encrypted config -> URL construction
3. Identify HTTP method (GET/POST), headers, and body format
4. Locate response parsing logic (JSON parsing, custom binary protocol)
5. Map the C2 command dispatcher (switch/case or jump table)
6. Document the command set (download, execute, exfiltrate, update, uninstall)
```
**Ghidra Script for extracting C2 configuration:**
```python
# Ghidra Python script: extract_c2_config.py
# Run via Script Manager in Ghidra
from ghidra.program.model.data import StringDataType
from ghidra.program.model.symbol import SourceType
# Search for XOR decryption patterns
listing = currentProgram.getListing()
memory = currentProgram.getMemory()
# Find references to InternetOpenA
symbol_table = currentProgram.getSymbolTable()
for symbol in symbol_table.getExternalSymbols():
if "InternetOpen" in symbol.getName():
refs = getReferencesTo(symbol.getAddress())
for ref in refs:
print("C2 init at: {}".format(ref.getFromAddress()))
```
### Step 5: Analyze Encryption and Obfuscation
Identify and document cryptographic routines:
```
Common Malware Encryption Patterns:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
XOR Cipher: Loop with XOR operation, often single-byte or rolling key
RC4: Two loops (KSA + PRGA), 256-byte S-box initialization
AES: Look for S-box constants (0x63, 0x7C, 0x77...) or calls to CryptEncrypt
Base64: Lookup table with A-Za-z0-9+/= characters
Custom: Combination of arithmetic operations (ADD, SUB, ROL, ROR with XOR)
Identification Tips:
- Search for constants: AES S-box, CRC32 table, MD5 init values
- Look for loop structures operating on byte arrays
- Check for Windows Crypto API usage (CryptAcquireContext -> CryptCreateHash -> CryptEncrypt)
- FindCrypt Ghidra plugin automatically identifies crypto constants
```
### Step 6: Document Findings and Create Detection Signatures
Produce actionable intelligence from reverse engineering:
```bash
# Generate YARA rule from unique code patterns found in Ghidra
cat << 'EOF' > malware_family_x.yar
rule MalwareFamilyX_Decryptor {
meta:
description = "Detects MalwareX decryption routine"
author = "analyst"
date = "2025-09-15"
strings:
// XOR decryption loop with hardcoded key
$decrypt = { 8A 04 0E 32 04 0F 88 04 0E 41 3B CA 7C F3 }
// C2 URL pattern after decryption
$c2_pattern = "/gate.php?id=" ascii
condition:
uint16(0) == 0x5A4D and $decrypt and $c2_pattern
}
EOF
```
## Key Concepts
| Term | Definition |
|------|------------|
| **Disassembly** | Converting machine code bytes into human-readable assembly language instructions; Ghidra's Listing view shows disassembled code |
| **Decompilation** | Lifting assembly code to pseudo-C representation for easier analysis; Ghidra's Decompile window provides this view |
| **Cross-Reference (XREF)** | Reference showing where a function or data address is called from or used; essential for tracing code execution flow |
| **Control Flow Graph (CFG)** | Visual representation of all possible execution paths through a function; reveals branching logic and loops |
| **Original Entry Point (OEP)** | The actual start address of the malware code after unpacking; packers redirect execution through an unpacking stub first |
| **Function Signature** | The return type, name, and parameter types of a function; applying correct signatures improves decompiler output quality |
| **Ghidra Script** | Python or Java automation script executed within Ghidra to perform batch analysis, pattern searching, or data extraction |
## Tools & Systems
- **Ghidra**: NSA's open-source software reverse engineering suite with disassembler, decompiler, and scripting support for multiple architectures
- **IDA Pro/Free**: Industry-standard interactive dRelated in General
modeling-omnistudio-epc-catalog
IncludedSalesforce Industries CME EPC product-modeling skill for Product2-based catalog creation. Use when creating EPC products, configuring product attributes, building offer bundles with Product Child Items, or reviewing EPC DataPack JSON metadata for product catalog changes. TRIGGER when: user creates or updates Product2 EPC records, AttributeAssignment payloads, AttributeMetadata/AttributeDefaultValues, Offer bundles, or ProductChildItem relationships. DO NOT TRIGGER when: designing OmniScripts/FlexCards/Integration Procedures (use building-omnistudio-omniscript, building-omnistudio-flexcard, or building-omnistudio-integration-procedure), implementing Apex business logic (use generating-apex), or troubleshooting deployment pipelines (use deploying-metadata).
relationship-science-coach
IncludedUse this skill for direct, practical adult relationship coaching: couples conflict, repair, trust, marriage, dating, flirting, attachment patterns, emotional connection, sex, desire differences, eroticism, kink negotiation, affection, love languages, breakups, and long-term passion. Draw on Gottman, EFT and Hold Me Tight, attachment science, modern sex research, Perel, Nagoski, Kerner, Schnarch, Love and Stosny, and flexible love-language tools. Be concrete and low-hedge. Redirect only for imminent danger, abuse, coercive control, minors, non-consent, self-harm, stalking, or medical/legal/psychiatric decisions.
building-sf-integrations
IncludedSalesforce integration architecture and runtime plumbing with 120-point scoring. Use this skill to set up Named Credentials, External Credentials, External Services, REST/SOAP callout patterns, Platform Events, and Change Data Capture. TRIGGER when: user sets up Named Credentials, External Services, REST/SOAP callouts, Platform Events, CDC, or touches .namedCredential-meta.xml files. DO NOT TRIGGER when: Connected App/OAuth config (use configuring-connected-apps), Apex-only logic (use generating-apex), or data import/export (use handling-sf-data).
venue-templates
IncludedAccess comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). This skill should be used when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals and need venue-specific formatting requirements and templates.
let-fate-decide
IncludedDraws the 12 Houses of the Zodiac Tarot spread to inject entropy into planning when prompts are vague, ambiguous, or casually delegated. Interprets the spread to guide next steps. Use when the user says 'let fate decide', 'YOLO', 'whatever', 'idk', or other nonchalant phrases, makes Yu-Gi-Oh references, or when you are about to arbitrarily pick between multiple reasonable approaches. Prefer over ask-questions-if-underspecified when the user's tone is casual or playful rather than precision-seeking.
net-ops
IncludedCross-platform network troubleshooting (Windows, macOS, Linux) via local or remote shell. Use for: DNS broken, can't resolve hostnames, nslookup/dig works but apps fail, NRPT, WFP, scutil, /etc/resolver, systemd-resolved, /etc/resolv.conf, NetworkManager, VPN DNS leak residue (ProtonVPN/Mullvad/WireGuard/AnyConnect), AV/firewall blocking DNS or DoH, Tailscale DNS interaction, intermittent connectivity, remote diagnostics over SSH.