mlops-workflows

Comprehensive MLOps workflows for the complete ML lifecycle - experiment tracking, model registry, deployment patterns, monitoring, A/B testing, and production best practices with MLflow

Machine Learning Operations, mlops, mlflow, experiment-tracking, model-registry, deployment, monitoring, ml-lifecycle, feature-stores


# MLOps Workflows with MLflow

A comprehensive guide to production-grade MLOps workflows covering the complete machine learning lifecycle from experimentation to production deployment and monitoring.

## Table of Contents

1. [MLflow Components Overview](#mlflow-components-overview)
2. [Experiment Tracking](#experiment-tracking)
3. [Model Registry](#model-registry)
4. [Deployment Patterns](#deployment-patterns)
5. [Monitoring and Observability](#monitoring-and-observability)
6. [A/B Testing](#ab-testing)
7. [Feature Stores](#feature-stores)
8. [CI/CD for ML](#cicd-for-ml)
9. [Model Versioning](#model-versioning)
10. [Production Best Practices](#production-best-practices)

## MLflow Components Overview

MLflow consists of four primary components for managing the ML lifecycle:

### 1. MLflow Tracking

Track experiments, parameters, metrics, and artifacts during model development.

```python
import mlflow

# Set tracking URI
mlflow.set_tracking_uri("http://localhost:5000")

# Create or set experiment
mlflow.set_experiment("production-models")

# Start a run
with mlflow.start_run(run_name="baseline-model"):
    # Log parameters
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("batch_size", 32)

    # Log metrics
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("loss", 0.05)

    # Log artifacts
    mlflow.log_artifact("model_plot.png")
```

### 2. MLflow Projects

Package ML code in a reusable, reproducible format.

```yaml
# MLproject file
name: my-ml-project
conda_env: conda.yaml

entry_points:
  main:
    parameters:
      learning_rate: {type: float, default: 0.01}
      epochs: {type: int, default: 100}
    command: "python train.py --lr {learning_rate} --epochs {epochs}"

  evaluate:
    parameters:
      model_uri: {type: string}
    command: "python evaluate.py --model-uri {model_uri}"
```
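
To make the `{learning_rate}` and `{epochs}` placeholders concrete, here is a minimal sketch of how declared parameters, defaults, and overrides combine into the final command. It is an illustration only, not MLflow's implementation: `ENTRY_POINTS` and `render_command` are hypothetical names, and the sketch only handles parameters declared in the entry point.

```python
# Hypothetical illustration: mirrors the "main" entry point of the MLproject file above.
ENTRY_POINTS = {
    "main": {
        "parameters": {
            "learning_rate": {"type": float, "default": 0.01},
            "epochs": {"type": int, "default": 100},
        },
        "command": "python train.py --lr {learning_rate} --epochs {epochs}",
    },
}


def render_command(entry_point, overrides=None):
    """Fill the command template with overrides, falling back to defaults."""
    spec = ENTRY_POINTS[entry_point]
    overrides = overrides or {}
    values = {}
    for name, decl in spec["parameters"].items():
        if name in overrides:
            raw = overrides[name]
        elif "default" in decl:
            raw = decl["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
        # Coerce to the declared type so "0.05" and 0.05 behave the same
        values[name] = decl["type"](raw)
    return spec["command"].format(**values)


default_cmd = render_command("main")
tuned_cmd = render_command("main", {"learning_rate": "0.05"})
```

With MLflow itself, the equivalent override is typically passed on the command line, for example `mlflow run . -P learning_rate=0.05`.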

### 3. MLflow Models

Package models in a standard format for deployment across platforms.

```python
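import mlflow.sklearn
from mlflow.models import infer_signature
from sklearn.ensemble import RandomForestClassifier

# Train model (assumes X_train and y_train are already loaded)
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Infer the input/output schema from the training data and predictions
signature = infer_signature(X_train, model.predict(X_train))

# Log the model with its signature and register it in one step.
# `name` is the MLflow 3.x argument; older releases use `artifact_path`.
mlflow.sklearn.log_model(
    sk_model=model,
    name="random-forest-model",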
    signature=signature,
    input_example=X_train[:5],
    registered_model_name="ProductionClassifier"
)
```

### 4. MLflow Registry

Centralized model store for managing model lifecycle and versioning.

```python
import mlflow
from mlflow import MlflowClient

client = MlflowClient()

# Register the model logged under an existing run
# (run_id is assumed to hold the ID of that run)
model_uri = f"runs:/{run_id}/model"
registered_model = mlflow.register_model(
    model_uri=model_uri,
    name="CustomerChurnModel"
)

# Set model alias for deployment
client.set_registered_model_alias(
    name="CustomerChurnModel",
    alias="production",
    version=registered_model.version
)
```
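
Once an alias is set, consumers can refer to the model by alias instead of a fixed version number. The sketch below builds the two registry URI forms, `models:/<name>/<version>` and `models:/<name>@<alias>`. The helper name `registry_uri` is hypothetical and exists only for illustration.

```python
def registry_uri(name, *, version=None, alias=None):
    """Build a registry URI from a model name plus exactly one of version or alias."""
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of version or alias")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version}"


alias_uri = registry_uri("CustomerChurnModel", alias="production")
pinned_uri = registry_uri("CustomerChurnModel", version=3)
```

A URI like `alias_uri` can then be passed to `mlflow.pyfunc.load_model`, so moving the `production` alias to a new version changes what gets loaded without touching the calling code.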

## Experiment Tracking

### Basic Experiment Tracking

```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Configure MLflow
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("house-price-prediction")

# Split the data (assumes the feature matrix X and target y are already loaded)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Training with MLflow tracking
with mlflow.start_run(run_name="rf-baseline"):
    # Define parameters
    params = {
        "n_estimators": 100,
        "max_depth": 10,
        "min_samples_split": 5,
        "random_state": 42
    }

    # Train model
    model = RandomForestRegressor(**params)
    model.fit(X_train, y_train)

    # Evaluate
    predictions = model.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    r2 = r2_score(y_test, predictions)

    # Log everything
    mlflow.log_params(params)
    mlflow.log_metrics({
        "mse": mse,
        "r2": r2,
        "rmse": mse ** 0.5
    })

    # Log model
    mlflow.sklearn.log_model(
        sk_model=model,
        name="model",
        registered_model_name="HousePricePredictor"
    )
```

### Autologging

MLflow provides automatic logging for popular frameworks:

```python
import mlflow
from sklearn.ensemble import RandomForestClassifier

# Enable autologging for scikit-learn
mlflow.sklearn.autolog()

# Your training code - parameters, training metrics, and the fitted model are logged automatically
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
```

### Nested Runs for Hyperparameter Tuning

```python
import mlflow
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import GradientBoostingClassifier

mlflow.set_experiment("hyperparameter-tuning")

cv_folds = 5

# Parent run for the entire tuning process
with mlflow.start_run(run_name="grid-search-parent"):
    param_grid = {
        'learning_rate': [0.01, 0.1, 0.3],
        'n_estimators': [50, 100, 200],
        'max_depth': [3, 5, 7]
    }

    # Log parent parameters
    mlflow.log_param("tuning_method", "grid_search")
    mlflow.log_param("cv_folds", cv_folds)

    best_score = 0
    best_params = None

    # Nested runs for each parameter combination
    for lr in param_grid['learning_rate']:
        for n_est in param_grid['n_estimators']:
            for depth in param_grid['max_depth']:
                with mlflow.start_run(nested=True, run_name=f"lr{lr}_n{n_est}_d{depth}"):
                    params = {
                        'learning_rate': lr,
                        'n_estimators': n_est,
                        'max_depth': depth
                    }

                    # Score each combination with cross-validation on the
                    # training data so the test set is not used for selection
                    # (assumes X_train and y_train are already loaded)
                    model = GradientBoostingClassifier(**params)
                    scores = cross_val_score(model, X_train, y_train, cv=cv_folds)
                    score = scores.mean()

                    mlflow.log_params(params)
                    mlflow.log_metric("accuracy", score)  # mean cross-validated accuracy

                    if score > best_score:
                        best_score = score
                        best_params = params

    # Log best results in parent run
    mlflow.log_params({f"best_{k}": v for k, v in best_params.items()})
    mlflow.log_metric("best_accuracy", best_score)
```
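
The three nested loops above can also be flattened into a single loop. The following is a minimal sketch using `itertools.product`. The helper names `iter_param_combinations` and `run_name_for` are hypothetical, and the grid is copied from the example above.

```python
from itertools import product

param_grid = {
    'learning_rate': [0.01, 0.1, 0.3],
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
}


def iter_param_combinations(grid):
    """Yield one dict per combination, in the same order as the nested loops."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def run_name_for(params):
    """Reproduce the run-name pattern used for the nested runs."""
    return f"lr{params['learning_rate']}_n{params['n_estimators']}_d{params['max_depth']}"


combos = list(iter_param_combinations(param_grid))
first_name = run_name_for(combos[0])
```

Each dict in `combos` can be passed directly as `GradientBoostingClassifier(**params)` inside a single `for params in combos:` loop that opens one nested run per combination. scikit-learn's `ParameterGrid` offers similar enumeration, though its iteration order may differ from the loops shown here.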

### Tracking Multiple Metrics Over Time

```python
import mlflow
import numpy as np

with mlflow.start_run():
    # Log metrics at different steps (epochs); the loss values here are
    # random placeholders standing in for a real training loop
    for epoch in range(100):
        train_loss = np.random.random() * (1 - epoch/100)
        val_loss = np.random.random() * (1 - epoch/100) + 0.1

        mlflow.log_metric("train_loss", train_loss, step=epoch)
        mlflow.log_metric("val_loss", val_loss, step=epoch)
        mlflow.log_metric("learning_rate", 0.01 * (0.95 ** epoch), step=epoch)
```

### Logging Artifacts

```python
import mlflow
import matplotlib.pyplot as plt
import pandas as pd

with mlflow.start_run():
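    # Log plot (assumes `history` holds per-epoch losses from a training run)
    plt.figure(figsize=(10, 6))
    plt.plot(history['loss'], label='Training Loss')
    plt.plot(history['val_loss'], label='Validation Loss')
    plt.legend()
    plt.savefig("loss_curve.png")
    plt.close()  # release the figure once it has been saved
    mlflow.log_artifact("loss_curve.png")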

    # Log dataframe as CSV
    feature_importance = pd.DataFrame({
        'feature': feature_names,
        'importance': model.feature_importances_
    })
    feature_importance.to_csv("feature_importance.csv", index=False)
    mlflow.log_artifact("feature_importance.csv")

    # Log entire directory
    mlflow.log_artifacts("output_dir/", artifact_path="outputs")
```

## Model Registry

### Registering Models

```python
from mlflow import MlflowClient
import mlflow.sklearn

client = MlflowClient()

# Method 1: Register during model logging
with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=model,
        name="model",
        registered_model_name="CustomerSegmentationModel"
    )

# Method 2: Register an existing model
run_id = "abc123"
model_uri = f"runs:/{run_id}/model"
registered_model = mlflow