oraclecloud-prod-checklist
Pre-production readiness checklist for OCI — backup policies, security audit, key rotation, encryption, and Cloud Guard. Use when preparing an OCI environment for production workloads or auditing an existing deployment. Trigger with "oraclecloud prod checklist", "oci production ready", "oci security audit", "oci well-architected".
# Oracle Cloud Production Checklist
## Overview
OCI has no direct equivalent of the AWS Well-Architected Review. This checklist is the pre-production gate: it covers backup policies, security list and NSG audit, API key rotation, compartment isolation, boot volume encryption, the OS Management agent, Cloud Guard, and Vulnerability Scanning. Every item is verifiable via the CLI or the Python SDK, so each check is pass/fail rather than a subjective assessment.
**Purpose:** Validate that an OCI environment meets production-grade security, resilience, and operational standards before going live.
## Prerequisites
- **OCI CLI installed and configured** — `~/.oci/config` validated (see `oraclecloud-install-auth`)
- **Python 3.8+** with the OCI SDK — `pip install oci`
- **IAM policies for auditing**: the checks require `inspect` and `read` across most service families (an administrator-level account satisfies this)
- **Target compartment OCID** — the compartment being audited
- **Cloud Guard** must be enabled at the tenancy level (Administration > Cloud Guard)
## Instructions
### Step 1: Compartment Isolation Audit
Production workloads must be in a dedicated compartment, not the root:
```bash
# List compartments: production should NOT be the root compartment
oci iam compartment list \
  --compartment-id "$TENANCY_OCID" \
  --query 'data[].{name:name, id:id, state:"lifecycle-state"}' \
  --output table

# Verify prod compartment has policies restricting access
oci iam policy list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data[].{name:name, statements:statements}' \
  --output json
```
**Pass criteria:** Production compartment is NOT the root tenancy. Policies follow least-privilege (no `manage all-resources in tenancy`).
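The least-privilege criterion can be checked mechanically once the policy list is saved as JSON. The following is a minimal sketch, not part of the original checklist: `find_broad_statements` is a hypothetical helper, and the input shape is assumed to match the `--query` projection used in the `oci iam policy list` command above.

```python
import re

# Hypothetical pattern: matches the tenancy-wide admin grant named in the pass criteria.
BROAD_GRANT = re.compile(r"\bmanage\s+all-resources\s+in\s+tenancy\b", re.IGNORECASE)


def find_broad_statements(policies):
    """Return (policy_name, statement) pairs that violate least privilege.

    `policies` is assumed to look like:
    [{"name": "...", "statements": ["...", ...]}, ...]
    """
    findings = []
    for policy in policies:
        for statement in policy.get("statements") or []:
            if BROAD_GRANT.search(statement):
                findings.append((policy["name"], statement))
    return findings


# Illustrative sample data, not real tenancy output.
sample = [
    {"name": "prod-admins",
     "statements": ["Allow group Admins to manage all-resources in tenancy"]},
    {"name": "prod-readers",
     "statements": ["Allow group Auditors to inspect all-resources in compartment prod"]},
]

for name, statement in find_broad_statements(sample):
    print(f"FAIL: {name}: {statement}")
```

Tenancy-scoped statements are attached to policies in the root compartment, so it is worth running the same check against `oci iam policy list --compartment-id "$TENANCY_OCID"` as well.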
### Step 2: Backup Policy Verification
```python
import os

import oci

config = oci.config.from_file("~/.oci/config")
blockstorage = oci.core.BlockstorageClient(config)
identity = oci.identity.IdentityClient(config)

# Read the compartment OCID from the environment instead of hard-coding it
compartment_id = os.environ["PROD_COMPARTMENT_OCID"]

# Boot volumes are listed per availability domain, so check every AD
for ad in identity.list_availability_domains(compartment_id=config["tenancy"]).data:
    boot_volumes = blockstorage.list_boot_volumes(
        availability_domain=ad.name,
        compartment_id=compartment_id,
    ).data

    for vol in boot_volumes:
        # Check backup policy assignment
        try:
            assignments = blockstorage.get_volume_backup_policy_asset_assignment(
                asset_id=vol.id
            ).data
            if assignments:
                print(f"PASS: {vol.display_name} - backup policy assigned")
            else:
                print(f"FAIL: {vol.display_name} - no backup policy")
        except oci.exceptions.ServiceError as exc:
            print(f"FAIL: {vol.display_name} - cannot check backup policy ({exc.code})")
```
**Pass criteria:** Every boot volume and block volume has an assigned backup policy (Bronze minimum: monthly incremental backups retained for twelve months).
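The pass criteria cover block volumes as well as boot volumes, so the same assignment lookup can be applied to any list of volumes. Below is a minimal sketch under stated assumptions: `audit_backup_policies` and `FakeBlockstorage` are hypothetical names, and the fake client exists only so the example runs offline. With the real SDK, the client would be `oci.core.BlockstorageClient` and the volumes would come from `list_boot_volumes(...)` and `list_volumes(compartment_id=...)`.

```python
from types import SimpleNamespace


def audit_backup_policies(blockstorage, volumes):
    """Return (status, display_name, reason) tuples for each volume.

    `blockstorage` is any object exposing
    get_volume_backup_policy_asset_assignment(asset_id=...).
    `volumes` is an iterable of objects with `.id` and `.display_name`.
    """
    results = []
    for vol in volumes:
        try:
            assignments = blockstorage.get_volume_backup_policy_asset_assignment(
                asset_id=vol.id
            ).data
        except Exception as exc:  # with the real SDK: oci.exceptions.ServiceError
            results.append(("FAIL", vol.display_name, f"lookup failed: {exc}"))
            continue
        if assignments:
            results.append(("PASS", vol.display_name, "backup policy assigned"))
        else:
            results.append(("FAIL", vol.display_name, "no backup policy"))
    return results


class FakeBlockstorage:
    """Offline stand-in for the SDK client, used only in this demonstration."""

    def __init__(self, assigned_ids):
        self.assigned_ids = set(assigned_ids)

    def get_volume_backup_policy_asset_assignment(self, asset_id):
        data = [SimpleNamespace(asset_id=asset_id)] if asset_id in self.assigned_ids else []
        return SimpleNamespace(data=data)


# Illustrative volumes: one boot volume with a policy, one block volume without.
volumes = [
    SimpleNamespace(id="vol-a", display_name="boot-a"),
    SimpleNamespace(id="vol-b", display_name="data-b"),
]
report = audit_backup_policies(FakeBlockstorage({"vol-a"}), volumes)
for status, name, reason in report:
    print(f"{status}: {name}: {reason}")
```

Returning tuples instead of printing directly keeps the result usable for the final go/no-go decision.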
### Step 3: Security List and NSG Audit
```bash
# List all security lists in the VCN, showing only ingress rules open to 0.0.0.0/0
oci network security-list list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --vcn-id "$VCN_OCID" \
  --query 'data[].{name:"display-name", ingress:"ingress-security-rules"[?source==`"0.0.0.0/0"`]}' \
  --output json

# FAIL if any NSG rule allows 0.0.0.0/0 ingress on ports other than 80/443
oci network nsg rules list \
  --nsg-id "$NSG_OCID" \
  --query 'data[?direction==`"INGRESS"` && source==`"0.0.0.0/0"` && "tcp-options"."destination-port-range".min!=`80` && "tcp-options"."destination-port-range".min!=`443`]' \
  --output table
```
**Pass criteria:** No security list allows unrestricted ingress (`0.0.0.0/0`) except ports 80 and 443. Prefer NSGs over security lists for production workloads.
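A JMESPath filter on the minimum port alone can miss wide ranges such as 80 to 8080, so the rule can also be evaluated in Python after exporting the rules with `--output json`. This is a sketch under assumptions: `is_unrestricted_violation` is a hypothetical helper, and the rule dictionaries are assumed to use the CLI's kebab-case keys (`source`, `tcp-options`, `destination-port-range`).

```python
ALLOWED_PUBLIC_PORTS = {80, 443}


def is_unrestricted_violation(rule):
    """True if a rule opens 0.0.0.0/0 to anything other than port 80 or 443."""
    if rule.get("source") != "0.0.0.0/0":
        return False
    tcp = rule.get("tcp-options") or {}
    port_range = tcp.get("destination-port-range")
    if not port_range:
        # No TCP port restriction: all ports, or a non-TCP protocol.
        return True
    low, high = port_range["min"], port_range["max"]
    # Only a single-port range on an allowed port passes.
    return not (low == high and low in ALLOWED_PUBLIC_PORTS)


# Illustrative rules, not real tenancy output.
rules = [
    {"source": "0.0.0.0/0", "protocol": "6",
     "tcp-options": {"destination-port-range": {"min": 443, "max": 443}}},
    {"source": "0.0.0.0/0", "protocol": "6",
     "tcp-options": {"destination-port-range": {"min": 22, "max": 22}}},
    {"source": "10.0.0.0/16", "protocol": "all", "tcp-options": None},
    {"source": "0.0.0.0/0", "protocol": "all", "tcp-options": None},
]

violations = [r for r in rules if is_unrestricted_violation(r)]
print(f"{len(violations)} violating rule(s)")
```

The check is deliberately strict: public UDP or ICMP rules are flagged too, which matches the pass criteria as written. Relax it if your workload needs them.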
### Step 4: API Key Rotation Check
```python
import oci
from datetime import datetime, timezone, timedelta

config = oci.config.from_file("~/.oci/config")
identity = oci.identity.IdentityClient(config)

# List all users in the tenancy (following pagination), then check each user's API keys
users = oci.pagination.list_call_get_all_results(
    identity.list_users, compartment_id=config["tenancy"]
).data
max_age = timedelta(days=90)
now = datetime.now(timezone.utc)

for user in users:
    keys = identity.list_api_keys(user_id=user.id).data
    for key in keys:
        # time_created is timezone-aware, so compare against an aware "now"
        age = now - key.time_created
        status = "PASS" if age < max_age else "FAIL"
        print(f"  {status}: {user.name} - key {key.fingerprint} - {age.days} days old")
```
**Pass criteria:** No API key older than 90 days. Automated rotation via OCI Vault recommended.
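The age comparison is easier to reason about, and to test, when it is isolated from the SDK calls. The sketch below uses a hypothetical `key_status` helper with a fixed reference time. It mirrors the `age < max_age` comparison in the script above, so a key that is exactly 90 days old already fails.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)


def key_status(time_created, now=None, max_age=MAX_KEY_AGE):
    """Return (status, age_in_days) for a timezone-aware creation timestamp."""
    now = now or datetime.now(timezone.utc)
    age = now - time_created
    return ("PASS" if age < max_age else "FAIL", age.days)


# Fixed reference time so the example is reproducible.
reference = datetime(2025, 1, 1, tzinfo=timezone.utc)

print(key_status(reference - timedelta(days=30), reference))  # ('PASS', 30)
print(key_status(reference - timedelta(days=90), reference))  # ('FAIL', 90)
```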
### Step 5: Boot Volume Encryption
```bash
# Check that all boot volumes use customer-managed keys (not Oracle-managed)
oci bv boot-volume list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data[].{name:"display-name", kms:"kms-key-id"}' \
  --output table

# FAIL if kms-key-id is null (Oracle-managed default encryption)
```
**Pass criteria:** All boot volumes encrypted with customer-managed keys from OCI Vault. Oracle-managed encryption is the default, but compliance programs built on frameworks such as SOC 2 or PCI-DSS often call for customer-controlled keys; confirm the requirement with your auditor.
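The null check on `kms-key-id` can be automated by running the command above with `--output json` and scanning the rows. A minimal sketch, assuming the rows have the `name` and `kms` keys produced by that `--query` projection; `volumes_without_cmk` is a hypothetical helper.

```python
def volumes_without_cmk(rows):
    """Return names of volumes whose `kms` field is null or empty."""
    return [row["name"] for row in rows if not row.get("kms")]


# Illustrative rows; the OCID is a placeholder, not a real key.
rows = [
    {"name": "boot-web-1", "kms": "ocid1.key.oc1..exampleuniqueID"},
    {"name": "boot-web-2", "kms": None},
]

for name in volumes_without_cmk(rows):
    print(f"FAIL: {name} uses Oracle-managed encryption")
```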
### Step 6: OS Management Agent Verification
```bash
# Check if instance agent plugins are enabled
oci instance-agent plugin list \
  --instanceagent-id "$INSTANCE_OCID" \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data[].{name:name, status:status}' \
  --output table

# Required plugins: Vulnerability Scanning, OS Management Service Agent, Compute Instance Run Command
```
**Pass criteria:** OS Management Service Agent, Vulnerability Scanning, and Run Command plugins are all `RUNNING`.
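To turn the plugin table into a pass/fail result, the same command can be run with `--output json` and compared against the required set. The sketch below assumes rows with `name` and `status` keys, as produced by the `--query` above; `missing_plugins` is a hypothetical helper.

```python
REQUIRED_PLUGINS = (
    "Vulnerability Scanning",
    "OS Management Service Agent",
    "Compute Instance Run Command",
)


def missing_plugins(rows, required=REQUIRED_PLUGINS):
    """Return required plugin names that are absent or not RUNNING."""
    running = {row["name"] for row in rows if row.get("status") == "RUNNING"}
    return [name for name in required if name not in running]


# Illustrative plugin list: one plugin stopped, one not reported at all.
rows = [
    {"name": "Vulnerability Scanning", "status": "RUNNING"},
    {"name": "OS Management Service Agent", "status": "STOPPED"},
]

missing = missing_plugins(rows)
print("PASS" if not missing else f"FAIL: not running: {', '.join(missing)}")
```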
### Step 7: Cloud Guard Status
```bash
# Verify Cloud Guard is enabled and detector recipes are active
oci cloud-guard target list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data.items[].{name:"display-name", state:"lifecycle-state"}' \
  --output table

# Check for open problems
oci cloud-guard problem list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --lifecycle-state "ACTIVE" \
  --query 'data.items[].{label:"resource-name", risk:"risk-level", detail:"additional-details"}' \
  --output table
```
**Pass criteria:** Cloud Guard target is ACTIVE with Oracle-managed detector recipes. Zero CRITICAL or HIGH risk problems.
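The "zero CRITICAL or HIGH" criterion reduces to counting problems by risk level. A minimal sketch, assuming the problem list is exported with `--output json` and each row carries the `risk` key from the `--query` above; `cloud_guard_gate` is a hypothetical helper.

```python
from collections import Counter

BLOCKING_LEVELS = ("CRITICAL", "HIGH")


def cloud_guard_gate(problems):
    """Return (passed, counts_by_risk) for a list of problem rows."""
    counts = Counter(problem.get("risk") for problem in problems)
    blocking = sum(counts[level] for level in BLOCKING_LEVELS)
    return blocking == 0, dict(counts)


# Illustrative problem rows, not real Cloud Guard output.
problems = [
    {"label": "bucket-logs", "risk": "HIGH", "detail": None},
    {"label": "instance-a", "risk": "LOW", "detail": None},
    {"label": "instance-b", "risk": "LOW", "detail": None},
]

passed, counts = cloud_guard_gate(problems)
print("PASS" if passed else "FAIL", counts)
```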
### Step 8: Vulnerability Scanning
```bash
# List scan recipes and recent results
oci vulnerability-scanning host scan recipe list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --output table

oci vulnerability-scanning host vulnerability list \
  --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data.items[?severity==`CRITICAL`].{name:name, severity:severity, cve:"cve-reference"}' \
  --output table
```
**Pass criteria:** Scan recipes assigned to all compute instances. Zero CRITICAL vulnerabilities.
## Output
Successful completion produces:
- An 8-point pass/fail checklist covering compartment isolation, backups, security rules, key rotation, encryption, OS agents, Cloud Guard, and vulnerability scanning
- Specific FAIL findings with remediation commands for each item
- A clear go/no-go decision for production deployment
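The go/no-go decision can be derived from the eight results without manual judgment. The sketch below is one possible aggregation, not a prescribed format: `go_no_go` and the check names are hypothetical, and a check that was never run counts against a GO.

```python
CHECKS = [
    "Compartment isolation",
    "Backup policies",
    "Security rules",
    "API key rotation",
    "Boot volume encryption",
    "OS agents",
    "Cloud Guard",
    "Vulnerability scanning",
]


def go_no_go(results):
    """results maps check name -> bool (True = pass).

    Returns (decision, failed_checks, missing_checks).
    """
    missing = [check for check in CHECKS if check not in results]
    failed = [check for check in CHECKS if check in results and not results[check]]
    decision = "GO" if not missing and not failed else "NO-GO"
    return decision, failed, missing


# Illustrative run: everything passes except the security rule audit.
results = {check: True for check in CHECKS}
results["Security rules"] = False

decision, failed, missing = go_no_go(results)
print(decision, "| failed:", failed, "| not run:", missing)
```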
## Error Handling
| Error | Code | Cause | Solution |
|-------|------|-------|----------|
| NotAuthorizedOrNotFound | 404 | Insufficient IAM policies for audit | Add `allow group auditors to inspect all-resources in compartment prod` |
| NotAuthenticated | 401 | API key expired or misconfigured | Rotate key per Step 4 and update `~/.oci/config` |
| Cloud Guard not enabled | — | Cloud Guard never activated at tenancy level | Enable via Console: Administration > Cloud Guard > Enable |
| TooManyRequests | 429 | Rate limited when scanning all compartments | Add 1-second delay between API calls — no Retry-After header from OCI |
| InternalError | 500 | OCI service issue | Retry after 60 seconds; check https://ocistatus.oraclecloud.com |
| Vulnerability Scanning not available | — | Not enabled for the region/compartment | Enable: Console > Security > Vulnerability Scanning > Create Recipe |
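The delays in the table can be wrapped around any SDK call. The following is a minimal sketch under assumptions: `call_with_retry` and `FakeServiceError` are hypothetical, and the only thing relied on from the SDK is that `oci.exceptions.ServiceError` exposes a numeric `status` attribute. The fake error and injected `sleep` exist so the example runs offline and instantly.

```python
import time

# Delay in seconds per retryable HTTP status, taken from the table above.
RETRY_DELAYS = {429: 1.0, 500: 60.0}


def call_with_retry(fn, attempts=3, sleep=time.sleep):
    """Call fn(), retrying on 429/500 with the delays from the table."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            delay = RETRY_DELAYS.get(getattr(exc, "status", None))
            if delay is None or attempt == attempts:
                raise  # not retryable, or out of attempts
            sleep(delay)


class FakeServiceError(Exception):
    """Offline stand-in for oci.exceptions.ServiceError."""

    def __init__(self, status):
        super().__init__(f"status {status}")
        self.status = status


calls = {"count": 0}
slept = []


def flaky():
    # Fails twice with 429, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeServiceError(429)
    return "ok"


result = call_with_retry(flaky, sleep=slept.append)
print(result, slept)
```

The SDK also ships its own retry strategies (for example `oci.retry.DEFAULT_RETRY_STRATEGY`, passed as `retry_strategy=` to a client call), which may be preferable to a hand-rolled loop.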
## Examples
**Quick pre-flight check (CLI one-liners):**
```bash
# Check compartment isolation
oci iam compartment get --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'data.name' --raw-output

# Count boot volumes without backup policies
oci bv boot-volume list --compartment-id "$PROD_COMPARTMENT_OCID" \
  --query 'length(data[?!"backup-policy-id"])' --raw-output

# Count open Cloud Guard problems
```