cost-optimization
# Cloud Cost Optimization
Strategies and patterns for optimizing cloud costs across AWS, Azure, and GCP.
## Do not use this skill when
- The task is unrelated to cloud cost optimization
- You need a different domain or tool outside this scope
## Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.
## Purpose
Implement systematic cost optimization strategies to reduce cloud spending while maintaining performance and reliability.
## Use this skill when
- Reducing cloud spending
- Right-sizing resources
- Implementing cost governance
- Optimizing multi-cloud costs
- Meeting budget constraints
## Cost Optimization Framework
### 1. Visibility
- Implement cost allocation tags
- Use cloud cost management tools
- Set up budget alerts
- Create cost dashboards
### 2. Right-Sizing
- Analyze resource utilization
- Downsize over-provisioned resources
- Use auto-scaling
- Remove idle resources
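
One way to surface right-sizing candidates is a CloudWatch alarm that fires when an instance stays mostly idle. The following is a minimal sketch under stated assumptions, not a definitive implementation: the `instance_id` variable, the SNS topic name, and the 10% / 24-hour thresholds are all hypothetical values to tune against your own utilization data.

```hcl
# Hypothetical input: the EC2 instance to evaluate.
variable "instance_id" {
  type        = string
  description = "ID of the EC2 instance to evaluate for right-sizing"
}

# Hypothetical SNS topic that receives right-sizing notifications.
resource "aws_sns_topic" "rightsizing" {
  name = "rightsizing-candidates"
}

# Fires when average CPU stays below 10% for 24 consecutive one-hour periods.
resource "aws_cloudwatch_metric_alarm" "instance_idle" {
  alarm_name          = "rightsizing-instance-idle"
  alarm_description   = "Instance looks over-provisioned or idle"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 24
  period              = 3600
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  statistic           = "Average"
  threshold           = 10

  # Scope the metric to the single instance under review.
  dimensions = {
    InstanceId = var.instance_id
  }

  alarm_actions = [aws_sns_topic.rightsizing.arn]
}
```

CPU alone does not show memory pressure; EC2 memory metrics require the CloudWatch agent, so treat an alarm like this as a prompt to investigate rather than a signal to downsize automatically.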
### 3. Pricing Models
- Use reserved capacity
- Leverage spot/preemptible instances
- Implement savings plans
- Use committed use discounts
### 4. Architecture Optimization
- Use managed services
- Implement caching
- Optimize data transfer
- Use lifecycle policies
## AWS Cost Optimization
### Reserved Instances
```
Savings: 30-72% vs On-Demand
Term: 1 or 3 years
Payment: All/Partial/No upfront
Flexibility: Standard or Convertible
```
### Savings Plans
```
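Compute Savings Plans: Up to 66% savings
  Applies to: EC2, Fargate, Lambda
  Flexible across: Instance families, sizes, regions, OS, tenancy
EC2 Instance Savings Plans: Up to 72% savings
  Applies to: EC2 only, one instance family in one region
  Flexible across: Sizes, OS, tenancy within that family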
```
### Spot Instances
```
Savings: Up to 90% vs On-Demand
Best for: Batch jobs, CI/CD, stateless workloads
Risk: 2-minute interruption notice
Strategy: Mix with On-Demand for resilience
```
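
The "mix with On-Demand" strategy above could be expressed with an Auto Scaling group that uses a mixed instances policy. This is a sketch only: the variables `ami_id` and `subnet_ids`, the resource names, the capacity numbers, and the instance types are illustrative assumptions rather than recommendations.

```hcl
# Hypothetical inputs for the sketch.
variable "ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

resource "aws_launch_template" "workers" {
  name_prefix   = "workers-"
  image_id      = var.ami_id
  instance_type = "t3.medium"
}

resource "aws_autoscaling_group" "workers" {
  name_prefix         = "workers-"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 4
  vpc_zone_identifier = var.subnet_ids

  mixed_instances_policy {
    instances_distribution {
      # Keep two On-Demand instances as a floor, then run 25% On-Demand
      # and 75% Spot for any capacity above that floor.
      on_demand_base_capacity                  = 2
      on_demand_percentage_above_base_capacity = 25
      spot_allocation_strategy                 = "capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.workers.id
        version            = "$Latest"
      }

      # Several instance types give Spot more capacity pools to draw from.
      override {
        instance_type = "t3.medium"
      }
      override {
        instance_type = "t3a.medium"
      }
      override {
        instance_type = "m5.large"
      }
    }
  }
}
```

The On-Demand base keeps a minimum of capacity running through Spot interruptions, while diversifying instance types tends to lower the chance that all Spot capacity is reclaimed at once.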
### S3 Cost Optimization
```hcl
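resource "aws_s3_bucket_lifecycle_configuration" "example" {
  bucket = aws_s3_bucket.example.id

  rule {
    id     = "transition-to-ia"
    status = "Enabled"

    # Empty filter applies the rule to every object in the bucket.
    filter {}

    # Move to Infrequent Access after 30 days.
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    # Move to Glacier after 90 days.
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Delete objects after one year.
    expiration {
      days = 365
    }
  }
}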
```
## Azure Cost Optimization
### Reserved VM Instances
- 1 or 3 year terms
- Up to 72% savings
- Flexible sizing
- Exchangeable
### Azure Hybrid Benefit
- Use existing Windows Server licenses
- Up to 80% savings with RI
- Available for Windows and SQL Server
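
In Terraform, Azure Hybrid Benefit for a Windows VM is typically enabled through the `license_type` argument. The sketch below assumes the resource group, network interface, and password are supplied through hypothetical variables; the VM name, size, and image are illustrative, and eligibility still depends on holding qualifying Windows Server licenses.

```hcl
# Hypothetical inputs for the sketch.
variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

variable "network_interface_id" {
  type = string
}

variable "admin_password" {
  type      = string
  sensitive = true
}

resource "azurerm_windows_virtual_machine" "example" {
  name                  = "hb-example-vm"
  resource_group_name   = var.resource_group_name
  location              = var.location
  size                  = "Standard_D2s_v3"
  admin_username        = "azureadmin"
  admin_password        = var.admin_password
  network_interface_ids = [var.network_interface_id]

  # Azure Hybrid Benefit: apply an existing Windows Server license so the
  # VM is billed without the Windows license charge.
  license_type = "Windows_Server"

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2019-Datacenter"
    version   = "latest"
  }
}
```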
### Azure Advisor Recommendations
- Right-size VMs
- Delete unused resources
- Use reserved capacity
- Optimize storage
## GCP Cost Optimization
### Committed Use Discounts
- 1 or 3 year commitment
- Up to 57% savings
- Applies to vCPUs and memory
- Resource-based or spend-based
### Sustained Use Discounts
- Automatic discounts
- Up to 30% for running instances
- No commitment required
- Applies to Compute Engine, GKE
### Preemptible VMs
- Up to 80% savings
- 24-hour maximum runtime (Spot VMs, the newer offering, have no runtime limit)
- Best for batch workloads
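
A preemptible instance can be requested through the `scheduling` block of `google_compute_instance`. This is a minimal sketch assuming the project's default network; the instance name, machine type, zone, and image are illustrative placeholders.

```hcl
resource "google_compute_instance" "batch_worker" {
  name         = "batch-worker"
  machine_type = "e2-medium"
  zone         = "us-central1-a"

  scheduling {
    # Preemptible instances can be reclaimed at any time and cannot
    # restart automatically, so automatic_restart must be false.
    preemptible       = true
    automatic_restart = false
  }

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  # Assumes the project's default VPC network exists.
  network_interface {
    network = "default"
  }
}
```

Because the instance can disappear mid-job, this fits workloads that checkpoint progress or can simply be retried.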
## Tagging Strategy
### AWS Tagging
```hcl
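locals {
  # Tags applied to every resource for cost allocation.
  common_tags = {
    Environment = "production"
    Project     = "my-project"
    CostCenter  = "engineering"
    Owner       = "owner@example.com" # placeholder address
    ManagedBy   = "terraform"
  }
}

resource "aws_instance" "example" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = "t3.medium"

  # Resource-specific tags are merged over the common set.
  tags = merge(
    local.common_tags,
    {
      Name = "web-server"
    }
  )
}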
```
**Reference:** See `references/tagging-standards.md`
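
As an alternative to merging tags on every resource, the AWS provider's `default_tags` block applies a tag set to all taggable resources it manages. The sketch below reuses the tag keys from the example above; the region is an assumption.

```hcl
provider "aws" {
  region = "us-east-1" # assumed region

  # Applied to every taggable resource created through this provider.
  default_tags {
    tags = {
      Environment = "production"
      Project     = "my-project"
      CostCenter  = "engineering"
      ManagedBy   = "terraform"
    }
  }
}
```

Tags set directly on a resource take precedence over `default_tags` when keys collide. Note that tags only show up in AWS billing reports after they are activated as cost allocation tags in the Billing console.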
## Cost Monitoring
### Budget Alerts
```hcl
# AWS Budget
resource "aws_budgets_budget" "monthly" {
  name              = "monthly-budget"
  budget_type       = "COST"
  limit_amount      = "1000"
  limit_unit        = "USD"
  time_period_start = "2024-01-01_00:00"
  time_unit         = "MONTHLY"

  # Email when actual spend exceeds 80% of the monthly limit.
  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["alerts@example.com"] # placeholder address
  }
}
```
### Cost Anomaly Detection
- AWS Cost Anomaly Detection
- Azure Cost Management alerts
- GCP Budget alerts
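
For AWS, anomaly detection can be provisioned alongside the budget. The following sketch assumes a recent AWS provider version (one that supports `threshold_expression`); the resource names, the email address, and the 100 USD impact threshold are hypothetical.

```hcl
# Watches spend per AWS service for unusual patterns.
resource "aws_ce_anomaly_monitor" "services" {
  name              = "service-spend-monitor"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "SERVICE"
}

# Sends a daily email summary of anomalies with at least 100 USD of impact.
resource "aws_ce_anomaly_subscription" "daily" {
  name             = "daily-anomaly-summary"
  frequency        = "DAILY"
  monitor_arn_list = [aws_ce_anomaly_monitor.services.arn]

  subscriber {
    type    = "EMAIL"
    address = "alerts@example.com" # placeholder address
  }

  threshold_expression {
    dimension {
      key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
      match_options = ["GREATER_THAN_OR_EQUAL"]
      values        = ["100"]
    }
  }
}
```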
## Architecture Patterns
### Pattern 1: Serverless First
- Use Lambda/Functions for event-driven workloads
- Pay only for execution time
- Auto-scaling included
- No idle costs
### Pattern 2: Right-Sized Databases
```
Development: db.t3.small RDS
Staging: db.t3.large RDS
Production: db.r6g.2xlarge RDS with read replicas
```
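
The per-environment sizing above could be driven by a lookup map so that one module serves all three environments. This is a sketch under assumptions: the `environment` and `db_password` variables, the PostgreSQL engine, the storage size, and the identifiers are hypothetical. RDS instance class names carry a `db.` prefix.

```hcl
variable "environment" {
  type    = string
  default = "development"
}

variable "db_password" {
  type      = string
  sensitive = true
}

locals {
  # Instance class per environment, following the pattern above.
  db_instance_class = {
    development = "db.t3.small"
    staging     = "db.t3.large"
    production  = "db.r6g.2xlarge"
  }
}

resource "aws_db_instance" "main" {
  identifier        = "app-${var.environment}"
  engine            = "postgres"
  instance_class    = local.db_instance_class[var.environment]
  allocated_storage = 20
  username          = "appadmin"
  password          = var.db_password

  # Pay for a standby replica only where availability matters most.
  multi_az = var.environment == "production"

  # Keep a final snapshot for production; skip it elsewhere.
  skip_final_snapshot       = var.environment != "production"
  final_snapshot_identifier = "app-${var.environment}-final"
}
```

An unknown environment name fails the map lookup at plan time, which keeps an unsized environment from silently getting a production-class instance.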
### Pattern 3: Multi-Tier Storage
```
Hot data: S3 Standard
Warm data: S3 Standard-IA (after 30 days)
Cold data: S3 Glacier Flexible Retrieval (after 90 days)
Archive: S3 Glacier Deep Archive (after 365 days)
```
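
The four tiers above map onto a single lifecycle rule with three transitions. This is a minimal sketch; the bucket and rule names are hypothetical, and the day counts simply mirror the pattern.

```hcl
resource "aws_s3_bucket" "tiered" {
  bucket_prefix = "tiered-data-"
}

resource "aws_s3_bucket_lifecycle_configuration" "tiered" {
  bucket = aws_s3_bucket.tiered.id

  rule {
    id     = "multi-tier"
    status = "Enabled"

    # Empty filter applies the rule to every object in the bucket.
    filter {}

    # Warm tier
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    # Cold tier
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Archive tier
    transition {
      days          = 365
      storage_class = "DEEP_ARCHIVE"
    }
  }
}
```

The colder classes carry minimum storage durations and retrieval fees, so tiering pays off mainly for data that is rarely read after it ages.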
### Pattern 4: Auto-Scaling
```hcl
# Add two instances when the alarm below fires.
resource "aws_autoscaling_policy" "scale_up" {
  name                   = "scale-up"
  scaling_adjustment     = 2
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.main.name
}

# Fire when average CPU across the group stays above 80% for two minutes.
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 80

  # Scope the metric to this Auto Scaling group.
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.main.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_up.arn]
}
```
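
Scaling up protects performance, but the savings come from scaling back down. A matching scale-in policy might look like the sketch below, which assumes the same `aws_autoscaling_group.main` referenced above; the 30% threshold and five-minute window are illustrative values.

```hcl
# Remove one instance at a time when the alarm below fires.
resource "aws_autoscaling_policy" "scale_down" {
  name                   = "scale-down"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.main.name
}

# Fire when average CPU across the group stays below 30% for five minutes.
resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  alarm_name          = "cpu-low"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 5
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 30

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.main.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_down.arn]
}
```

Scaling in more slowly than scaling out (one instance at a time, over a longer window) is a common way to avoid flapping between the two policies.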
## Cost Optimization Checklist
- [ ] Implement cost allocation tags
- [ ] Delete unused resources (EBS, EIPs, snapshots)
- [ ] Right-size instances based on utilization
- [ ] Use reserved capacity for steady workloads
- [ ] Implement auto-scaling
- [ ] Optimize storage classes
- [ ] Use lifecycle policies
- [ ] Enable cost anomaly detection
- [ ] Set budget alerts
- [ ] Review costs weekly
- [ ] Use spot/preemptible instances
- [ ] Optimize data transfer costs
- [ ] Implement caching layers
- [ ] Use managed services
- [ ] Monitor and optimize continuously
## Tools
- **AWS:** Cost Explorer, Cost Anomaly Detection, Compute Optimizer
- **Azure:** Cost Management, Advisor
- **GCP:** Cost Management, Recommender
- **Multi-cloud:** CloudHealth, Cloudability, Kubecost
## Reference Files
- `references/tagging-standards.md` - Tagging conventions
- `assets/cost-analysis-template.xlsx` - Cost analysis spreadsheet
## Related Skills
- `terraform-module-library` - For resource provisioning
- `multi-cloud-architecture` - For cloud selection
## Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.