k8s-cluster-api
Kubernetes Cluster API v1.12. Covers clusterctl CLI, ClusterClass, GitOps integration. Scripts for health checks, backup, migration, linting. Templates: clusters, DR, Prometheus. Use when provisioning, upgrading, or operating Kubernetes clusters with CAPI, or running clusterctl and ClusterClass workflows. Keywords: CAPI, clusterctl, kubeadm, cluster lifecycle.
# Kubernetes Cluster API
Kubernetes Cluster API (CAPI) is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.
## Overview
Started by SIG Cluster Lifecycle, Cluster API uses Kubernetes-style APIs and patterns to automate cluster lifecycle management. The infrastructure (VMs, networks, load balancers, VPCs) and Kubernetes configuration are defined declaratively, enabling consistent and repeatable cluster deployments across environments.
### Why Cluster API?
While kubeadm reduces installation complexity, it doesn't address day-to-day cluster management:
- How to consistently provision infrastructure across providers and locations?
- How to automate cluster lifecycle (upgrades, deletion)?
- How to scale processes to manage any number of clusters?
Cluster API addresses these gaps with declarative, Kubernetes-style APIs that automate cluster creation, configuration, and management.
### Goals
- Manage the lifecycle (create, scale, upgrade, destroy) of Kubernetes-conformant clusters via a declarative API
- Work in different environments (on-premises and cloud)
- Define common operations with swappable implementations
- Reuse existing ecosystem components (cluster-autoscaler, node-problem-detector)
- Provide a transition path for existing tools to adopt incrementally
### Non-Goals
- Add APIs to Kubernetes core
- Manage infrastructure unrelated to Kubernetes clusters
- Force all lifecycle products to use these APIs
- Manage non-CAPI provisioned clusters
- Manage a single cluster spanning multiple infrastructure providers
- Configure machines after create/upgrade
## Quick Navigation
| Topic | Reference |
| ---------------------------- | --------------------------------------------------------- |
| Getting Started | [getting-started.md](references/getting-started.md) |
| Concepts & Architecture | [concepts.md](references/concepts.md) |
| Certificates | [certificates.md](references/certificates.md) |
| Bootstrap (Kubeadm/MicroK8s) | [bootstrap.md](references/bootstrap.md) |
| Cluster Operations | [cluster-operations.md](references/cluster-operations.md) |
| Experimental Features | [experimental.md](references/experimental.md) |
| clusterctl CLI | [clusterctl.md](references/clusterctl.md) |
| Developer Guide | [developer.md](references/developer.md) |
| Troubleshooting | [troubleshooting.md](references/troubleshooting.md) |
| API Reference & Providers | [api-reference.md](references/api-reference.md) |
| Security & PSS | [security.md](references/security.md) |
| Controllers | [controllers.md](references/controllers.md) |
| Version Migrations | [migrations.md](references/migrations.md) |
| FAQ | [faq.md](references/faq.md) |
| Best Practices | [best-practices.md](references/best-practices.md) |
## When to Use
- Provisioning Kubernetes clusters across multiple infrastructure providers
- Managing cluster lifecycle (create, scale, upgrade, destroy)
- Automating cluster operations with declarative APIs
- Implementing GitOps workflows for cluster management
- Building custom infrastructure providers
## Core Concepts
### Architecture
```
┌─────────────────────────────────────────┐
│           Management Cluster            │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │ CAPI Core   │  │ Infrastructure  │   │
│  │ Controllers │  │ Provider        │   │
│  └─────────────┘  └─────────────────┘   │
│  ┌─────────────┐  ┌─────────────────┐   │
│  │ Bootstrap   │  │ Control Plane   │   │
│  │ Provider    │  │ Provider        │   │
│  └─────────────┘  └─────────────────┘   │
└─────────────────────┬───────────────────┘
                      │ manages
          ┌───────────┴───────────┐
          ▼                       ▼
 ┌─────────────────┐     ┌─────────────────┐
 │    Workload     │     │    Workload     │
 │   Cluster 1     │     │   Cluster N     │
 └─────────────────┘     └─────────────────┘
```
### Key Components
| Component               | Purpose                                                  |
| ----------------------- | -------------------------------------------------------- |
| Management Cluster      | Hosts the CAPI controllers and manages workload clusters |
| Workload Cluster        | User clusters whose lifecycle is managed by CAPI         |
| Infrastructure Provider | Provisions VMs, networks, and load balancers             |
| Bootstrap Provider      | Generates cloud-init/Ignition bootstrap configs          |
| Control Plane Provider  | Manages the lifecycle of control plane nodes             |
### Core Resources
| Resource | Description |
| ------------------ | ---------------------------------------- |
| Cluster | Represents a Kubernetes cluster |
| Machine | Represents a single node/VM |
| MachineSet | Manages replicas of Machines |
| MachineDeployment | Declarative updates for MachineSets |
| MachineHealthCheck | Automatic remediation of unhealthy nodes |
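
The resources in the table form a hierarchy (Cluster, then MachineDeployment, MachineSet, and Machine), and the objects that belong to one cluster carry the `cluster.x-k8s.io/cluster-name` label. A minimal sketch for listing them from the management cluster follows; the function name `list_cluster_objects` and the `default` namespace fallback are illustrative assumptions, not part of CAPI.

```bash
# list_cluster_objects CLUSTER_NAME [NAMESPACE]
# Hypothetical helper: prints the Cluster object, then the machine-related
# objects that carry the cluster-name label for that cluster.
list_cluster_objects() {
  local cluster_name="$1"
  local namespace="${2:-default}"   # assumption: fall back to "default"

  # The top-level Cluster object.
  kubectl get cluster "$cluster_name" -n "$namespace"

  # MachineDeployments, MachineSets, and Machines labelled for this cluster.
  kubectl get machinedeployments,machinesets,machines \
    -n "$namespace" \
    -l "cluster.x-k8s.io/cluster-name=${cluster_name}"
}
```

For example, `list_cluster_objects my-cluster` run against the management cluster's kubeconfig. `clusterctl describe cluster my-cluster` gives a tree view of the same objects.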
## Quick Start
```bash
# Install clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.12.0/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/
# Initialize management cluster
clusterctl init --infrastructure docker
# Create workload cluster
clusterctl generate cluster my-cluster --kubernetes-version v1.32.0 --control-plane-machine-count 1 --worker-machine-count 3 | kubectl apply -f -
# Get cluster kubeconfig
clusterctl get kubeconfig my-cluster > my-cluster.kubeconfig
# Delete cluster
kubectl delete cluster my-cluster
```
## Common Workflows
### Cluster Lifecycle
```bash
# Create cluster from template
clusterctl generate cluster prod-cluster \
  --infrastructure aws \
  --kubernetes-version v1.32.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 5 \
  | kubectl apply -f -
# Scale workers
kubectl scale machinedeployment prod-cluster-md-0 --replicas=10
# Upgrade Kubernetes version (applies to clusters with a ClusterClass-managed topology, i.e. spec.topology is set)
kubectl patch cluster prod-cluster --type merge -p '{"spec":{"topology":{"version":"v1.33.0"}}}'
# Move cluster to new management cluster
clusterctl move --to-kubeconfig target-mgmt.kubeconfig
```
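
Kubernetes does not support skipping minor versions during an upgrade, so it can help to check the target version before patching the cluster. The sketch below is one way to do that in plain shell; the helper names (`k8s_major`, `k8s_minor`, `is_safe_upgrade`) are hypothetical, and it assumes versions are written as `vMAJOR.MINOR.PATCH`.

```bash
# Print the major number of a version such as "v1.32.0".
k8s_major() {
  local v="${1#v}"        # drop the leading "v"
  echo "${v%%.*}"         # keep everything before the first dot
}

# Print the minor number of a version such as "v1.32.0".
k8s_minor() {
  local v="${1#v}"
  local rest="${v#*.}"    # drop "MAJOR."
  echo "${rest%%.*}"      # keep everything before the next dot
}

# is_safe_upgrade FROM TO
# Returns 0 when TO has the same major version as FROM and is at most one
# minor release ahead (patch-level changes are not inspected).
is_safe_upgrade() {
  local from="$1" to="$2"
  [ "$(k8s_major "$from")" -eq "$(k8s_major "$to")" ] || return 1
  local delta=$(( $(k8s_minor "$to") - $(k8s_minor "$from") ))
  [ "$delta" -ge 0 ] && [ "$delta" -le 1 ]
}
```

A possible guard before the patch command: `is_safe_upgrade v1.32.0 v1.33.0 && kubectl patch cluster prod-cluster ...`. Going from `v1.32.0` to `v1.34.0` fails the check and would need an intermediate upgrade to `v1.33.x` first.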
### Health Monitoring
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: my-cluster-mhc
spec:
  clusterName: my-cluster
  maxUnhealthy: 40%
  nodeStartupTimeout: 10m
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 5m
    - type: Ready
      status: Unknown
      timeout: 5m
```
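
The `maxUnhealthy: 40%` setting above acts as a circuit breaker: remediation proceeds only while the number of unhealthy machines stays at or below that share of the selected machines. The arithmetic can be sketched as follows, assuming the percentage is scaled against the total selected machines and rounded down; the function name `remediation_allowed` is hypothetical, and the exact rounding should be confirmed against the MachineHealthCheck documentation for your CAPI version.

```bash
# remediation_allowed TOTAL UNHEALTHY PERCENT
# Returns 0 when UNHEALTHY does not exceed PERCENT of TOTAL.
# Assumption: the percentage is rounded down to a whole machine count.
remediation_allowed() {
  local total="$1" unhealthy="$2" percent="$3"
  local max_unhealthy=$(( total * percent / 100 ))   # integer division rounds down
  [ "$unhealthy" -le "$max_unhealthy" ]
}
```

Under these assumptions, a pool of 5 machines at 40% tolerates 2 unhealthy machines: `remediation_allowed 5 2 40` succeeds, while `remediation_allowed 5 3 40` fails, meaning remediation would be held back until an operator intervenes.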
## Critical Prohibitions
- Do NOT modify the management cluster directly without a proper backup
- Do NOT delete Machine objects directly (scale the MachineDeployment instead)
- Do NOT mix provider versions without checking compatibility
- Do NOT skip cluster upgrade steps (control plane before workers)
- Do NOT ignore MachineHealthCheck alerts
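
The control-plane-before-workers rule above can be sketched for clusters that are not managed through a ClusterClass topology, where the version is set on the KubeadmControlPlane and on each MachineDeployment separately. This is a minimal sketch under assumptions: the helper names are hypothetical, the control plane object name `prod-cluster-control-plane` is only an example, and a real upgrade often also needs an updated infrastructure machine template (for instance a new image), which is not shown.

```bash
# upgrade_control_plane KCP_NAME VERSION
# Step 1: set the target version on the KubeadmControlPlane.
upgrade_control_plane() {
  local kcp_name="$1" version="$2"
  kubectl patch kubeadmcontrolplane "$kcp_name" --type merge \
    -p "{\"spec\":{\"version\":\"${version}\"}}"
}

# upgrade_workers MD_NAME VERSION
# Step 2: set the same version on a MachineDeployment. Run this only after
# the control plane rollout from step 1 has completed.
upgrade_workers() {
  local md_name="$1" version="$2"
  kubectl patch machinedeployment "$md_name" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"version\":\"${version}\"}}}}"
}
```

A typical sequence would be `upgrade_control_plane prod-cluster-control-plane v1.33.0`, then checking `kubectl get kubeadmcontrolplane` until all replicas report the new version, and only then `upgrade_workers prod-cluster-md-0 v1.33.0`.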
## Release Highlights (1.13.x)
- As of the `1.13.2` line, supported Kubernetes versions are `v1.32.x -> v1.36.x` for management clusters and `v1.30.x -> v1.36.x` for workload clusters.
- `v1alpha3` and `v1alpha4` API versions are now removed; providers should keep moving toward the `v1beta2` contract because `v1beta1` remains on the path to becoming unserved in a later release.
- Cluster topology can now drive `rolloutAfter` for both control plane and `MachineDeployment` resources.
- KubeadmControlPlane improves remediation tolerance for multiple failures and better surfaces common join/remediation symptoms.
- `PriorityQueue` and `ReconcilerRateLimiting`