Claude
Skills
Sign in
Back

devops-expert

Included with Lifetime
$97 forever

Expert-level DevOps practices, culture, automation, and continuous delivery

devopsdevopsci-cdautomationinfrastructureculture

What this skill does


# DevOps Expert

Expert guidance for DevOps practices, culture, CI/CD pipelines, infrastructure automation, and operational excellence.

## Core Concepts

### DevOps Culture
- Collaboration and communication
- Shared responsibility
- Continuous improvement
- Breaking down silos
- Blameless culture
- Measuring everything

### Automation
- Infrastructure as Code (IaC)
- Configuration management
- Deployment automation
- Testing automation
- Monitoring automation
- Self-service platforms

### CI/CD
- Continuous Integration
- Continuous Delivery
- Continuous Deployment
- Pipeline as Code
- Artifact management
- Release strategies

## CI/CD Pipeline

```yaml
# GitHub Actions Example
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linting
        run: npm run lint

      - name: Run tests
        run: npm test

      - name: Run security scan
        run: npm audit

      - name: Upload coverage
        uses: codecov/codecov-action@v3

  build:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - uses: actions/checkout@v3

      - name: Log in to Container Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  deploy-staging:
    needs: build
    if: github.ref == 'refs/heads/develop'
    runs-on: ubuntu-latest
    environment: staging

    steps:
      - name: Deploy to staging
        run: |
          kubectl set image deployment/myapp \
            myapp=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            --namespace=staging

      - name: Wait for rollout
        run: kubectl rollout status deployment/myapp -n staging

      - name: Run smoke tests
        run: npm run test:smoke

  deploy-production:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production

    steps:
      - name: Deploy to production
        run: |
          kubectl set image deployment/myapp \
            myapp=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            --namespace=production

      - name: Wait for rollout
        run: kubectl rollout status deployment/myapp -n production
```

## Infrastructure as Code

```python
# Pulumi Infrastructure as Code
import pulumi
import pulumi_aws as aws

# VPC
vpc = aws.ec2.Vpc("main-vpc",
    cidr_block="10.0.0.0/16",
    enable_dns_hostnames=True,
    enable_dns_support=True,
    tags={"Name": "main-vpc"})

# Subnets
public_subnet = aws.ec2.Subnet("public-subnet",
    vpc_id=vpc.id,
    cidr_block="10.0.1.0/24",
    availability_zone="us-east-1a",
    map_public_ip_on_launch=True,
    tags={"Name": "public-subnet"})

private_subnet = aws.ec2.Subnet("private-subnet",
    vpc_id=vpc.id,
    cidr_block="10.0.2.0/24",
    availability_zone="us-east-1b",
    tags={"Name": "private-subnet"})

# Internet Gateway
igw = aws.ec2.InternetGateway("igw",
    vpc_id=vpc.id,
    tags={"Name": "main-igw"})

# Route Table
route_table = aws.ec2.RouteTable("public-rt",
    vpc_id=vpc.id,
    routes=[
        aws.ec2.RouteTableRouteArgs(
            cidr_block="0.0.0.0/0",
            gateway_id=igw.id,
        )
    ],
    tags={"Name": "public-rt"})

# Security Group
security_group = aws.ec2.SecurityGroup("web-sg",
    vpc_id=vpc.id,
    description="Allow HTTP and HTTPS traffic",
    ingress=[
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=80,
            to_port=80,
            cidr_blocks=["0.0.0.0/0"],
        ),
        aws.ec2.SecurityGroupIngressArgs(
            protocol="tcp",
            from_port=443,
            to_port=443,
            cidr_blocks=["0.0.0.0/0"],
        ),
    ],
    egress=[
        aws.ec2.SecurityGroupEgressArgs(
            protocol="-1",
            from_port=0,
            to_port=0,
            cidr_blocks=["0.0.0.0/0"],
        )
    ])

# EKS Cluster
cluster = aws.eks.Cluster("app-cluster",
    role_arn=cluster_role.arn,
    vpc_config=aws.eks.ClusterVpcConfigArgs(
        subnet_ids=[public_subnet.id, private_subnet.id],
        security_group_ids=[security_group.id],
    ))

# Export outputs
pulumi.export("vpc_id", vpc.id)
pulumi.export("cluster_name", cluster.name)
pulumi.export("cluster_endpoint", cluster.endpoint)
```

## Deployment Strategies

```python
from typing import List, Dict
import time

class DeploymentStrategy:
    """Implement various deployment strategies"""

    def __init__(self, service_name: str):
        self.service_name = service_name

    def blue_green_deployment(self, blue_version: str, green_version: str):
        """Blue-Green deployment"""
        # Deploy green environment
        self.deploy_environment("green", green_version)

        # Run tests on green
        if self.run_tests("green"):
            # Switch traffic to green
            self.switch_traffic("green")

            # Keep blue for rollback
            print(f"Deployment successful. Blue ({blue_version}) kept for rollback.")
        else:
            # Rollback - keep blue active
            print("Tests failed on green. Keeping blue active.")

    def canary_deployment(self, current_version: str, new_version: str,
                         canary_percentage: int = 10):
        """Canary deployment"""
        # Deploy canary with small percentage
        self.deploy_canary(new_version, canary_percentage)

        # Monitor metrics
        metrics = self.monitor_canary_metrics(duration_minutes=10)

        if metrics['error_rate'] < 0.1 and metrics['latency_p95'] < 500:
            # Gradually increase canary traffic
            for percentage in [25, 50, 75, 100]:
                self.update_canary_traffic(percentage)
                time.sleep(300)  # 5 minutes between increases

                if not self.check_health():
                    self.rollback(current_version)
                    return False

            print(f"Canary deployment successful: {new_version}")
            return True
        else:
            self.rollback(current_version)
            print("Canary deployment failed - rolled back")
            return False

    def rolling_deployment(self, version: str, batch_size: int = 1):
        """Rolling deployment"""
        instances = self.get_instances()

        for i in range(0, len(instances), batch_size):
            batch = instances[i:i + batch_size]

            # Update batch
            for instance in batch:
                self.update_instance(instance, version)
                self.wait_for_healthy(instance)

            # Verify batch health
            if not self.check_health():
                print(f"Rolling deployment failed at batch {i//batch_size + 1}")
                return False

        print(f"Rolling deployment successful: {version}")
        return True

    def feature_flag_deployment(self, feature_name: str, enabled: bool,
                               rollout_percentage: int = 100):
        """Feature flag based deployment"""
        return {
            'feature': feature_name,
            'enabled': enabled,
            'roll

Related in devops