performing-cloud-asset-inventory-with-cartography
Perform comprehensive cloud asset inventory and relationship mapping using Cartography to build a Neo4j security graph of infrastructure assets, IAM permissions, and attack paths across AWS, GCP, and Azure.
What this skill does
# Performing Cloud Asset Inventory with Cartography
## Overview
Cartography is a CNCF sandbox project (originally created at Lyft) that consolidates infrastructure assets and their relationships into a Neo4j graph database. It queries cloud APIs to discover resources, maps relationships between them, and enables security teams to identify attack paths, generate asset reports, and find areas for security improvement. The graph model reveals hidden connections such as IAM permission chains, network paths, and cross-account trust relationships.
## When to Use
- When conducting security assessments that involve performing cloud asset inventory with cartography
- When following incident response procedures for related security events
- When performing scheduled security testing or auditing activities
- When validating security controls through hands-on testing
## Prerequisites
- Python 3.8+
- Neo4j 4.x or 5.x database
- Cloud provider credentials (AWS, GCP, Azure)
- Docker (optional, for Neo4j deployment)
- Minimum 4GB RAM for Neo4j, more for large environments
## Installation
```bash
# Install Cartography
pip install cartography
# Verify installation
cartography --help
```
### Deploy Neo4j with Docker
```bash
docker run -d \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/changethispassword \
-e NEO4J_PLUGINS='["apoc"]' \
-v neo4j_data:/data \
neo4j:5-community
```
## Running Cartography
### Basic AWS Sync
```bash
# Sync AWS account data to Neo4j
cartography \
--neo4j-uri bolt://localhost:7687 \
--neo4j-user neo4j \
--neo4j-password-env-var NEO4J_PASSWORD
```
### Sync specific AWS modules
```bash
cartography \
--neo4j-uri bolt://localhost:7687 \
--neo4j-user neo4j \
--neo4j-password-env-var NEO4J_PASSWORD \
--aws-sync-all-profiles
```
### GCP Sync
```bash
cartography \
--neo4j-uri bolt://localhost:7687 \
--neo4j-user neo4j \
--neo4j-password-env-var NEO4J_PASSWORD \
--gcp-requested-syncs compute iam storage
```
## Security-Focused Cypher Queries
### Find all S3 buckets with public access
```cypher
MATCH (b:S3Bucket)
WHERE b.anonymous_access = true
OR b.anonymous_actions IS NOT NULL
RETURN b.name, b.anonymous_actions, b.region, b.arn
ORDER BY b.name
```
### Identify IAM users with admin policies
```cypher
MATCH (user:AWSUser)-[:POLICY]->(policy:AWSPolicy)
WHERE policy.name = 'AdministratorAccess'
OR policy.arn CONTAINS 'AdministratorAccess'
RETURN user.name, user.arn, policy.name, user.password_last_used
```
### Find EC2 instances exposed to internet
```cypher
MATCH (instance:EC2Instance)-[:MEMBER_OF_EC2_SECURITY_GROUP]->(sg:EC2SecurityGroup)
-[:MEMBER_OF_EC2_SECURITY_GROUP_RULE]->(rule:IpRule)
WHERE rule.fromport <= 22 AND rule.toport >= 22
AND rule.protocol IN ['tcp', '-1']
AND '0.0.0.0/0' IN rule.ipranges
RETURN instance.instanceid, instance.publicipaddress, sg.groupid, sg.name
```
### Discover cross-account trust relationships
```cypher
MATCH (role:AWSRole)-[:TRUSTS_AWS_PRINCIPAL]->(principal:AWSPrincipal)
WHERE principal.arn CONTAINS ':root'
AND NOT principal.arn CONTAINS role.accountid
RETURN role.arn, role.name, principal.arn AS trusted_account
ORDER BY role.name
```
### Find attack path from public EC2 to sensitive S3
```cypher
MATCH path = (instance:EC2Instance)-[:STS_ASSUME_ROLE_ALLOWS|MEMBER_OF_EC2_SECURITY_GROUP|
POLICY|INSTANCE_PROFILE*1..5]->(bucket:S3Bucket)
WHERE instance.publicipaddress IS NOT NULL
AND bucket.name CONTAINS 'sensitive'
RETURN path
LIMIT 25
```
### Identify unused IAM roles
```cypher
MATCH (role:AWSRole)
WHERE role.last_used IS NULL
OR role.last_used < datetime().epochMillis - (90 * 24 * 60 * 60 * 1000)
RETURN role.name, role.arn, role.last_used
ORDER BY role.last_used
```
### Find Lambda functions with overprivileged roles
```cypher
MATCH (func:AWSLambda)-[:STS_ASSUME_ROLE_ALLOWS]->(role:AWSRole)-[:POLICY]->(policy:AWSPolicy)
WHERE policy.name = 'AdministratorAccess'
RETURN func.name, func.arn, role.name, policy.name
```
### Network path analysis
```cypher
MATCH (vpc:AWSVpc)-[:RESOURCE]->(subnet:EC2Subnet)-[:MEMBER_OF_SUBNET]->(instance:EC2Instance)
WHERE instance.publicipaddress IS NOT NULL
RETURN vpc.id, subnet.subnetid, subnet.cidr_block, instance.instanceid,
instance.publicipaddress, instance.state
```
## Scheduling Regular Syncs
### Cron-based sync
```bash
# Add to crontab - sync every 6 hours
0 */6 * * * /usr/local/bin/cartography \
--neo4j-uri bolt://localhost:7687 \
--neo4j-user neo4j \
--neo4j-password-env-var NEO4J_PASSWORD \
>> /var/log/cartography/sync.log 2>&1
```
### Docker Compose deployment
```yaml
version: '3.8'
services:
neo4j:
image: neo4j:5-community
ports:
- "7474:7474"
- "7687:7687"
environment:
NEO4J_AUTH: neo4j/securepwd123
NEO4J_PLUGINS: '["apoc"]'
NEO4J_dbms_memory_heap_max__size: 4G
volumes:
- neo4j_data:/data
cartography:
image: ghcr.io/cartography-cncf/cartography:latest
depends_on:
- neo4j
environment:
NEO4J_PASSWORD: securepwd123
AWS_DEFAULT_REGION: us-east-1
command: >
--neo4j-uri bolt://neo4j:7687
--neo4j-user neo4j
--neo4j-password-env-var NEO4J_PASSWORD
volumes:
neo4j_data:
```
## Data Model Overview
### Key Node Types
- `AWSAccount`, `GCPProject`, `AzureSubscription`
- `EC2Instance`, `S3Bucket`, `RDSInstance`, `AWSLambda`
- `AWSUser`, `AWSRole`, `AWSGroup`, `AWSPolicy`
- `EC2SecurityGroup`, `EC2Subnet`, `AWSVpc`
- `GCPInstance`, `GCSBucket`, `GCPRole`
### Key Relationship Types
- `RESOURCE`: Account owns resource
- `POLICY`: Principal has policy attached
- `STS_ASSUME_ROLE_ALLOWS`: Principal can assume role
- `MEMBER_OF_EC2_SECURITY_GROUP`: Instance belongs to SG
- `TRUSTS_AWS_PRINCIPAL`: Cross-account trust
## References
- Cartography GitHub: https://github.com/cartography-cncf/cartography
- Cartography Documentation: https://cartography.dev
- CNCF Sandbox Project
- Neo4j Cypher Query Language Reference
Related in Cloud & DevOps
appbuilder-action-scaffolder
IncludedCreate, implement, deploy, and debug Adobe Runtime actions with consistent layout, validation, and error handling. Use this skill whenever the user needs to add actions to an App Builder project, understand action structure (params, response format, web/raw actions), configure actions in the manifest, use App Builder SDKs (State, Files, Events, database), deploy and invoke actions via CLI, debug action issues, or implement patterns such as webhook receivers, custom event providers, journaling consumers, large payload redirects, action sequence pipelines, and Asset Compute workers. Also trigger when users mention serverless functions in Adobe context, action logging, IMS authentication for actions, or cron-style scheduled actions.
orchestrating-datacloud
IncludedSalesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. Use this skill when the user needs a multi-step Data Cloud pipeline, cross-phase troubleshooting, or data space and data kit management. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase sf data360 workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching phase-specific skill), the task is STDM/session tracing/parquet telemetry (use observing-agentforce), standard CRM SOQL (use querying-soql), or Apex implementation (use generating-apex).
github-project-automation
IncludedAutomate GitHub repository setup with CI/CD workflows, issue templates, Dependabot, and CodeQL security scanning. Includes 12 production-tested workflows and prevents 18 errors: YAML syntax, action pinning, and configuration. Use when: setting up GitHub Actions CI/CD, creating issue/PR templates, enabling Dependabot or CodeQL scanning, deploying to Cloudflare Workers, implementing matrix testing, or troubleshooting YAML indentation, action version pinning, secrets syntax, runner versions, or CodeQL configuration. Keywords: github actions, github workflow, ci/cd, issue templates, pull request templates, dependabot, codeql, security scanning, yaml syntax, github automation, repository setup, workflow templates, github actions matrix, secrets management, branch protection, codeowners, github projects, continuous integration, continuous deployment, workflow syntax error, action version pinning, runner version, github context, yaml indentation error
sf-datacloud
IncludedSalesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase `sf data360` workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching sf-datacloud-* skill), the task is STDM/session tracing/parquet telemetry (use sf-ai-agentforce-observability), standard CRM SOQL (use sf-soql), or Apex implementation (use sf-apex).
fabric-cli
IncludedUse this skill for Fabric.so CLI workflows with the `fabric` terminal command: diagnose/install/login, search or browse a Fabric library, save notes/links/files, create folders, ask the Fabric AI assistant, manage tasks/workspaces, generate shell completion, check subscription usage, produce JSON output, and use Fabric as persistent agent memory. Do not use for Microsoft Fabric/Azure/Power BI `fab`, Daniel Miessler's Fabric framework, Python Fabric SSH, Fabric.js, or textile/fashion fabric.
lark
IncludedLark/Feishu CLI skills: lark-cli operations for docs, markdown, sheets, base, calendar, im, mail, task, okr, drive, wiki, slides, whiteboard, apps, approval, attendance, contact, vc, minutes, event. Use when the user needs to operate Lark/Feishu resources via lark-cli, send messages, manage documents, spreadsheets, calendars, tasks, OKRs, deploy web pages, or any Feishu/Lark workspace operations.