Claude
Skills
Sign in
Back

publish-models

Included with Lifetime
$97 forever

Push and publish custom AI models to Replicate, and set up CI/CD for releasing new model versions safely. Use when running cog push, deploying a model to Replicate, releasing a new version, validating a model with cog-safe-push before publishing, configuring a Replicate deployment, setting up GitHub Actions for model releases, or porting a community model to an official one. Trigger on phrases like "push a model to Replicate", "publish a model", "deploy a model", "release a new version", "cog push", "cog-safe-push", "model CI", "r8.im", or "schema compatibility", and when referencing github.com/replicate/cog-safe-push or github.com/replicate/model-ci-template. Covers cog push, the full cog-safe-push config (test cases, fuzz, deployment, official_model), GitHub Actions patterns, multi-model matrix pushes, and post-publish monitoring. Assumes you already have a working Cog project; see build-models if you need to package one first.

Cloud & DevOps

What this skill does


## Docs

- Cog reference: <https://cog.run/llms.txt>
- `cog push` reference: <https://cog.run/cli#cog-push>
- cog-safe-push: <https://github.com/replicate/cog-safe-push>
- Model CI template: <https://github.com/replicate/model-ci-template>
- Continuous deployment guide: <https://replicate.com/docs/guides/continuous-model-deployment>

## When to use this skill

- You have a working Cog project (see `build-models` if you don't yet).
- You want to publish a private or public model on Replicate.
- You're releasing a new version of an existing model and want to avoid breaking changes.
- You're setting up CI/CD for model releases.

## Prerequisites

- Cog installed and `cog login` against `r8.im` (or `echo $TOKEN | cog login --token-stdin`).
- A model created at `replicate.com/{owner}/{name}` via the API, web UI, or `r8-model` CLI.
- `REPLICATE_API_TOKEN` set in your environment.

## Plain `cog push`

The simplest path. Build and upload a new version:

```
cog push r8.im/owner/my-model
```

Or set `image: r8.im/owner/my-model` in `cog.yaml` and run a bare:

```
cog push
```

Useful flags:

- `--separate-weights` — store weights in a separate layer; faster cold boots and pushes for models with > 1GB of weights.
- `--x-fast` — faster pushes during iteration (skips some validation).
- `--secret id=hf,src=$HOME/.hf_token` — pass build-time secrets without baking them into image history.

## cog-safe-push (recommended for any model with users)

`cog-safe-push` pushes to a private `-test` model first, checks schema compatibility against the live version, runs prediction comparisons, and fuzzes inputs. Catches breaking changes before they reach users.

Install:

```
pip install git+https://github.com/replicate/cog-safe-push.git
```

Required env vars:

- `REPLICATE_API_TOKEN`
- `ANTHROPIC_API_KEY` (Claude judges output similarity for stochastic models)

Basic usage:

```
cog-safe-push --test-hardware=gpu-l40s owner/my-model
```

This will:

1. Lint `predict.py` with ruff.
2. Create a private test model `owner/my-model-test` if missing.
3. Push the local Cog model to the test model.
4. Lint the schema (descriptions, defaults, etc.).
5. Check schema compatibility against the live `owner/my-model` version.
6. Run prediction comparisons between live and test versions.
7. Fuzz the test model with AI-generated inputs.
8. If everything passes, push to `owner/my-model`.

## cog-safe-push.yaml schema

Drop a `cog-safe-push.yaml` in your project root (or `cog-safe-push-configs/<variant>.yaml` for multi-model repos). All five test-case checker types in one example:

```yaml
model: owner/my-model
test_model: owner/my-model-test
test_hardware: gpu-l40s

predict:
  compare_outputs: false              # set false for stochastic models
  predict_timeout: 600
  test_cases:
    - inputs:
        prompt: "a serene mountain landscape"
      match_prompt: "a landscape photo of mountains"   # AI-judged via Claude
    - inputs:
        prompt: "a cat"
      match_url: "https://example.com/reference-cat.png"   # binary/image match
    - inputs:
        prompt: ""
      error_contains: "prompt cannot be empty"           # negative test
    - inputs:
        mode: "json"
      jq_query: '.confidence > 0.8 and .status == "success"'   # JSON output
    - inputs:
        prompt: "echo this"
      exact_string: "echo this"                          # exact string match
  fuzz:
    fixed_inputs:
      seed: 42
    disabled_inputs:
      - debug
    iterations: 10
    prompt: "Generate creative and diverse prompts"

train:                                  # if your model has a trainer
  destination: owner/my-model-trained
  destination_hardware: gpu-l40s
  train_timeout: 1800
  test_cases:
    - inputs:
        input_images: "https://.../training.zip"
        steps: 10

deployment:                             # auto-create or update on push
  name: my-model
  owner: owner
  hardware: gpu-l40s

parallel: 4
fast_push: false
ignore_schema_compatibility: false
official_model: owner/my-model         # for proxy/wrapper models, see below
```

Test case checkers are mutually exclusive: pick exactly one of `match_prompt`, `match_url`, `error_contains`, `jq_query`, or `exact_string` per case. Use `compare_outputs: false` for any stochastic model (diffusion, LLMs); the default `true` is brittle.

## CI/CD: GitHub Actions

Two paths, depending on how much glue you want.

### Path A: roll your own

```yaml
# .github/workflows/push.yaml
name: Push to Replicate
on:
  workflow_dispatch:
    inputs:
      no_push:
        type: boolean
        default: false

jobs:
  push:
    runs-on: ubuntu-latest-4-cores       # builds need disk + cores
    steps:
      - uses: actions/checkout@v4
      - uses: jlumbroso/[email protected]
        with:
          tool-cache: false
          docker-images: false
      - uses: replicate/setup-cog@v2
        with:
          token: ${{ secrets.REPLICATE_API_TOKEN }}
      - run: pip install git+https://github.com/replicate/cog-safe-push.git
      - env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          REPLICATE_API_TOKEN: ${{ secrets.REPLICATE_API_TOKEN }}
        run: |
          cog-safe-push -vv ${{ inputs.no_push && '--no-push' || '' }}
```

Add a `concurrency:` block so PR builds cancel each other while main-branch pushes queue:

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
```

### Path B: reusable workflow from model-ci-template

For Replicate-style multi-model repos, drop in:

```yaml
# .github/workflows/ci.yaml
name: CI
on:
  pull_request: { branches: [main] }
  push: { branches: [main] }
  workflow_dispatch:
    inputs:
      models: { type: string, default: "all" }
      ignore_schema_checks: { type: boolean, default: false }
      cog_version: { type: string, default: "latest" }
      test_only: { type: boolean, default: false }

jobs:
  ci:
    uses: replicate/model-ci-template/.github/workflows/template.yaml@main
    with:
      trigger_type: ${{ github.event_name }}
      models: ${{ inputs.models || 'all' }}
      ignore_schema_checks: ${{ inputs.ignore_schema_checks || false }}
      cog_version: ${{ inputs.cog_version || 'latest' }}
      test_only: ${{ inputs.test_only || false }}
    secrets: inherit
```

The reusable workflow expects:

- `cog-safe-push-configs/<model>.yaml` — one per model variant.
- `script/select-model` — bash file with `if/elif [[ "$MODEL" == "..." ]]` blocks listing valid model names.
- Secrets: `COG_TOKEN`, `REPLICATE_API_TOKEN`, `ANTHROPIC_API_KEY`.

## Multi-model matrix pushes

Pattern from `replicate/cog-flux`: one repo, N variants, push them in parallel.

```yaml
jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set.outputs.matrix }}
    steps:
      - id: set
        run: |
          if [ "${{ inputs.models }}" = "all" ]; then
            echo 'matrix={"model":["schnell","dev","krea-dev"]}' >> "$GITHUB_OUTPUT"
          else
            list=$(echo "${{ inputs.models }}" | jq -Rc 'split(",")')
            echo "matrix={\"model\":$list}" >> "$GITHUB_OUTPUT"
          fi

  push:
    needs: prepare
    runs-on: ubuntu-latest-4-cores
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.prepare.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - run: ./script/select.sh ${{ matrix.model }}     # produces cog.yaml from a template
      - run: cog-safe-push --config cog-safe-push-configs/${{ matrix.model }}.yaml -vv
```

## Two-pass push for proxy / official models

When you maintain a proxy that wraps a third-party API, you push to a private wrapper first, then update the public-facing official model card. Pattern from `replicate/cog-official-template`:

```bash
./script/write-api-key                                              # bake API key into config
cog-safe-push --config cog-safe-push-configs/${MODEL}.yaml -vv

./script/delete-api-key              
Files: 1
Size: 11.3 KB
Complexity: 21/100
Category: Cloud & DevOps

Related in Cloud & DevOps