analyzing-typosquatting-domains-with-dnstwist
Detect typosquatting, homograph phishing, and brand impersonation domains using dnstwist to generate domain permutations and identify registered lookalike domains targeting your organization.
# Analyzing Typosquatting Domains with DNSTwist
## Overview
DNSTwist is a domain name permutation engine that generates similar-looking domain names to detect typosquatting, homograph phishing attacks, and brand impersonation. It creates thousands of domain permutations using techniques like character substitution, transposition, insertion, omission, and homoglyph replacement, then checks DNS records (A, AAAA, NS, MX), calculates web page similarity using fuzzy hashing (ssdeep) and perceptual hashing (pHash), and identifies potentially malicious registered domains.
## When to Use
- When investigating phishing or brand-impersonation incidents that may involve lookalike domains
- When building detection rules or threat hunting queries for typosquatted and homograph domains
- When SOC analysts need a structured procedure for finding and triaging registered lookalike domains
- When validating security monitoring coverage for domain-impersonation techniques
## Prerequisites
- Python 3.9+ with `dnstwist` installed (`pip install "dnstwist[full]"`; the quotes keep shells such as zsh from expanding the brackets)
- Optional: GeoIP database for IP geolocation
- Optional: Shodan API key for enrichment
- Network access to perform DNS queries
- Understanding of DNS record types and domain registration
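
Before running a scan, it can help to confirm the basics from the list above. The following is a minimal sketch, not part of dnstwist: `check_prerequisites` is a hypothetical helper that only checks the interpreter version and whether a `dnstwist` executable is on `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 9), tool="dnstwist"):
    """Return a list of problems found; an empty list means the basics are in place."""
    problems = []
    # Compare the running interpreter against the minimum version.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(tool) is None:
        problems.append(f"'{tool}' not found on PATH")
    return problems


for problem in check_prerequisites():
    print(f"[-] {problem}")
```

Network reachability and optional enrichment keys are not covered by this check.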
## Key Concepts
### Domain Permutation Techniques
DNSTwist generates permutations using the following techniques:
- addition: appending characters
- bitsquatting: flipping a single bit in a character, mimicking memory bit-flip errors
- homoglyph: substituting visually similar characters, including Unicode lookalikes and ASCII pairs such as rn for m
- hyphenation: adding hyphens
- insertion: inserting characters
- omission: removing characters
- repetition: repeating characters
- replacement: replacing characters with adjacent keyboard keys
- subdomain: inserting dots
- transposition: swapping adjacent characters
- vowel-swap: swapping vowels
- dictionary-based: appending common words
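
To make a few of these techniques concrete, here is a simplified sketch of omission, transposition, repetition, and hyphenation. It is not dnstwist's implementation; the function names are illustrative, and it assumes the part before the first dot is the label to permute.

```python
def omission(label):
    """Drop one character at each position."""
    return {label[:i] + label[i + 1:] for i in range(len(label))}


def transposition(label):
    """Swap each pair of adjacent, differing characters."""
    return {
        label[:i] + label[i + 1] + label[i] + label[i + 2:]
        for i in range(len(label) - 1)
        if label[i] != label[i + 1]
    }


def repetition(label):
    """Double one character at each position."""
    return {label[:i] + label[i] + label[i:] for i in range(len(label))}


def hyphenation(label):
    """Insert a hyphen between each pair of characters."""
    return {label[:i] + "-" + label[i:] for i in range(1, len(label))}


def permutations(domain):
    """Return sorted (technique, lookalike domain) pairs for a domain."""
    label, _, tld = domain.partition(".")
    found = set()
    for fuzzer in (omission, transposition, repetition, hyphenation):
        for variant in fuzzer(label):
            if variant and variant != label:
                found.add((fuzzer.__name__, f"{variant}.{tld}"))
    return sorted(found)


for technique, lookalike in permutations("example.com")[:5]:
    print(technique, lookalike)
```

Even these four techniques produce dozens of candidates for a short label, which is why the full set of techniques reaches into the thousands.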
### Fuzzy Hashing and Visual Similarity
DNSTwist uses ssdeep (locality-sensitive hash) to compare HTML content and pHash (perceptual hash) to compare screenshots of web pages. This helps identify cloned phishing sites that visually mimic the legitimate site. A high similarity score suggests the page was copied from the legitimate site and should be reviewed as a likely phishing page.
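
The idea of a 0-100 similarity score can be illustrated without installing any hashing library. The sketch below uses stand-ins: `difflib.SequenceMatcher` in place of ssdeep for HTML text, and a Hamming-distance comparison of two integer hashes, which is a common way to compare perceptual hashes. Neither function reproduces dnstwist's actual scoring, and both names are hypothetical.

```python
from difflib import SequenceMatcher


def html_similarity(a, b):
    """0-100 similarity of two HTML strings (difflib stand-in, not ssdeep)."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def hash_similarity(h1, h2, bits=64):
    """0-100 similarity of two integer hashes based on differing bits."""
    mask = (1 << bits) - 1
    distance = bin((h1 ^ h2) & mask).count("1")
    return round(100 * (bits - distance) / bits)


legit = "<html><title>Example Login</title><form action='/login'></form></html>"
clone = "<html><title>Example Login</title><form action='http://evil.example/c'></form></html>"

print(html_similarity(legit, clone))   # high, but below 100
print(hash_similarity(0b1111, 0b0000))  # 4 of 64 bits differ
```

A clone that changes only the form target still scores close to the original, which is the property the similarity checks rely on.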
### Detection Workflow
The typical workflow is: generate domain permutations -> resolve DNS records -> check for registered domains -> compare web page similarity -> flag suspicious domains -> alert security team -> request takedown. For a typical corporate domain, dnstwist generates 5,000-10,000 permutations.
## Workflow
### Step 1: Basic Domain Permutation Scan
```python
import subprocess
import json
import csv
from datetime import datetime


def run_dnstwist_scan(domain, output_file=None):
    """Run dnstwist scan against a target domain."""
    cmd = [
        "dnstwist",
        "--registered",        # Only show registered domains
        "--format", "json",    # Output in JSON
        "--nameservers", "8.8.8.8,1.1.1.1",
        "--threads", "50",
        "--mxcheck",           # Check MX records
        # Fuzzy hash comparison. Newer dnstwist releases may expose this as
        # "--lsh ssdeep" instead; confirm with `dnstwist --help`.
        "--ssdeep",
        "--geoip",             # GeoIP lookup
        domain,
    ]
    print(f"[*] Scanning permutations for: {domain}")
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
    if result.returncode == 0:
        results = json.loads(result.stdout)
        # Keep only entries that resolve to an IPv4 or IPv6 address
        registered = [r for r in results if r.get("dns_a") or r.get("dns_aaaa")]
        print(f"[+] Found {len(registered)} registered lookalike domains")
        if output_file:
            with open(output_file, "w") as f:
                json.dump(registered, f, indent=2)
            print(f"[+] Results saved to {output_file}")
        return registered
    else:
        print(f"[-] dnstwist error: {result.stderr}")
        return []


results = run_dnstwist_scan("example.com", "typosquat_results.json")
```
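
For sharing findings with analysts who prefer spreadsheets, the JSON results can be flattened to CSV. This is a minimal sketch: `results_to_csv` is a hypothetical helper, the column list is an assumption about which fields matter, and the sample entry is invented for illustration. dnstwist can also emit CSV directly with `--format csv`.

```python
import csv
import io


def results_to_csv(results, stream):
    """Write selected dnstwist fields to a CSV stream, joining list values with ';'."""
    fields = ["fuzzer", "domain", "dns_a", "dns_aaaa", "dns_mx", "dns_ns"]
    writer = csv.DictWriter(stream, fieldnames=fields)
    writer.writeheader()
    for entry in results:
        row = {}
        for field in fields:
            value = entry.get(field, "")
            # DNS fields are lists in the JSON output; flatten them for one cell.
            row[field] = ";".join(value) if isinstance(value, list) else value
        writer.writerow(row)


# Invented sample entry using documentation-reserved addresses
sample = [{
    "fuzzer": "omission",
    "domain": "exmple.com",
    "dns_a": ["192.0.2.10"],
    "dns_mx": ["mail.exmple.com"],
}]

buffer = io.StringIO()
results_to_csv(sample, buffer)
print(buffer.getvalue())
```

Passing an open file instead of `io.StringIO` writes the same rows to disk; open it with `newline=""` as the `csv` module recommends.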
### Step 2: Analyze and Prioritize Results
```python
def analyze_results(results, legitimate_ips=None):
    """Analyze dnstwist results and prioritize threats."""
    legitimate_ips = legitimate_ips or set()
    high_risk = []
    medium_risk = []
    low_risk = []

    for entry in results:
        domain = entry.get("domain", "")
        fuzzer = entry.get("fuzzer", "")
        dns_a = entry.get("dns_a", [])
        dns_mx = entry.get("dns_mx", [])
        # The key name for the similarity score may differ across dnstwist versions
        ssdeep_score = entry.get("ssdeep_score", 0)
        risk_score = 0
        risk_factors = []

        # High similarity to legitimate site
        if ssdeep_score and ssdeep_score > 50:
            risk_score += 40
            risk_factors.append(f"high web similarity ({ssdeep_score}%)")

        # Has MX records (can receive email / phishing)
        if dns_mx:
            risk_score += 20
            risk_factors.append("has MX records (email capable)")

        # Recently registered (if whois data available)
        whois_created = entry.get("whois_created", "")
        if whois_created:
            try:
                created = datetime.fromisoformat(whois_created.replace("Z", "+00:00"))
                age_days = (datetime.now(created.tzinfo) - created).days
                if age_days < 30:
                    risk_score += 30
                    risk_factors.append(f"recently registered ({age_days} days)")
                elif age_days < 90:
                    risk_score += 15
                    risk_factors.append(f"registered {age_days} days ago")
            except (ValueError, TypeError):
                pass

        # Homoglyph attacks are highest risk
        if fuzzer == "homoglyph":
            risk_score += 25
            risk_factors.append("homoglyph (visually identical)")
        elif fuzzer in ("addition", "replacement", "transposition"):
            risk_score += 10
            risk_factors.append(f"permutation type: {fuzzer}")

        # Not pointing to legitimate infrastructure
        if dns_a and not set(dns_a).intersection(legitimate_ips):
            risk_score += 10
            risk_factors.append("different IP from legitimate")

        entry["risk_score"] = risk_score
        entry["risk_factors"] = risk_factors

        if risk_score >= 50:
            high_risk.append(entry)
        elif risk_score >= 25:
            medium_risk.append(entry)
        else:
            low_risk.append(entry)

    high_risk.sort(key=lambda x: x["risk_score"], reverse=True)
    medium_risk.sort(key=lambda x: x["risk_score"], reverse=True)

    print("\n=== Typosquatting Analysis ===")
    print(f"High Risk: {len(high_risk)}")
    print(f"Medium Risk: {len(medium_risk)}")
    print(f"Low Risk: {len(low_risk)}")

    if high_risk:
        print("\n--- High Risk Domains ---")
        for entry in high_risk[:10]:
            print(f"  {entry['domain']} (score: {entry['risk_score']})")
            for factor in entry['risk_factors']:
                print(f"    - {factor}")

    return {"high": high_risk, "medium": medium_risk, "low": low_risk}


analysis = analyze_results(results, legitimate_ips={"93.184.216.34"})
```
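
To see how the weights in `analyze_results` combine, the arithmetic can be worked through in isolation. The sketch below copies the weights and thresholds from the function above; the signal names and the `total` and `band` helpers are illustrative only.

```python
# Weights and thresholds copied from analyze_results; signal names are illustrative.
WEIGHTS = {
    "high_similarity": 40,
    "has_mx": 20,
    "registered_under_30_days": 30,
    "registered_under_90_days": 15,
    "homoglyph": 25,
    "common_fuzzer": 10,
    "foreign_ip": 10,
}


def total(signals):
    """Sum the weights of the signals that fired for one domain."""
    return sum(WEIGHTS[name] for name in signals)


def band(score):
    """Map a score to the same risk bands used by analyze_results."""
    if score >= 50:
        return "high"
    if score >= 25:
        return "medium"
    return "low"


# A homoglyph domain with MX records on unfamiliar infrastructure: 25 + 20 + 10
print(total(["homoglyph", "has_mx", "foreign_ip"]), band(55))
# A transposition domain on unfamiliar infrastructure: 10 + 10
print(total(["common_fuzzer", "foreign_ip"]), band(20))
```

Because the two registration-age signals and the two fuzzer signals are mutually exclusive, the maximum reachable score is 40 + 20 + 30 + 25 + 10 = 125, so scores should be read as a ranking rather than a percentage.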
### Step 3: Continuous Monitoring Pipeline
```python
import time
import hashlib


class TyposquatMonitor:
    def __init__(self, domains, known_domains_file="known_typosquats.json"):
        self.domains = domains
        self.known_file = known_domains_file
        self.known_domains = self._load_known()

    def _load_known(self):
        try:
            with open(self.known_file, "r") as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def _save_known(self):
        with open(self.known_file, "w") as f:
            json.dump(self.known_domains, f, indent=2)

    def scan_all_domains(self):
        """Scan all monitored domains for new typosquats."""
        new_findings = []
        for domain in self.domains:
            results = run_dnstwist_scan(domain)
            for entry in results:
                domain_key = entry.get("domain", "")
                if domain_key not in self.known_domains:
                    entry["first_seen"] = datetime.now().isoformat()
                    entry["monitored_domain"] = domain
                    self.known_domains[domain_k
# [listing truncated in source]
```