# MLOps Workflows with MLflow
A comprehensive guide to production-grade MLOps workflows covering the complete machine learning lifecycle from experimentation to production deployment and monitoring.
## Table of Contents
1. [MLflow Components Overview](#mlflow-components-overview)
2. [Experiment Tracking](#experiment-tracking)
3. [Model Registry](#model-registry)
4. [Deployment Patterns](#deployment-patterns)
5. [Monitoring and Observability](#monitoring-and-observability)
6. [A/B Testing](#ab-testing)
7. [Feature Stores](#feature-stores)
8. [CI/CD for ML](#cicd-for-ml)
9. [Model Versioning](#model-versioning)
10. [Production Best Practices](#production-best-practices)
## MLflow Components Overview
MLflow consists of four primary components for managing the ML lifecycle:
### 1. MLflow Tracking
Track experiments, parameters, metrics, and artifacts during model development.
```python
import mlflow
# Set tracking URI
mlflow.set_tracking_uri("http://localhost:5000")
# Create or set experiment
mlflow.set_experiment("production-models")
# Start a run
with mlflow.start_run(run_name="baseline-model"):
    # Log parameters
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("batch_size", 32)
    # Log metrics
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("loss", 0.05)
    # Log artifacts
    mlflow.log_artifact("model_plot.png")
```
### 2. MLflow Projects
Package ML code in a reusable, reproducible format.
```yaml
# MLproject file
name: my-ml-project
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      learning_rate: {type: float, default: 0.01}
      epochs: {type: int, default: 100}
    command: "python train.py --lr {learning_rate} --epochs {epochs}"
  evaluate:
    parameters:
      model_uri: {type: string}
    command: "python evaluate.py --model-uri {model_uri}"
```
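To make the `{learning_rate}`-style placeholders concrete, the sketch below mimics how an entry point's command template could be filled from its declared defaults plus caller overrides. It is an illustration only: `ENTRY_POINT`, `render_command`, and the cast table are hypothetical names, and MLflow performs its own substitution and validation when it runs a project.

```python
# Hypothetical illustration of MLproject parameter substitution.
# This is not MLflow's implementation; it only mirrors the idea.

ENTRY_POINT = {
    "parameters": {
        "learning_rate": {"type": "float", "default": 0.01},
        "epochs": {"type": "int", "default": 100},
    },
    "command": "python train.py --lr {learning_rate} --epochs {epochs}",
}

# Map the declared parameter types onto Python casts.
_CASTS = {"float": float, "int": int, "string": str}


def render_command(entry_point, overrides=None):
    """Fill the command template with defaults, then caller overrides."""
    overrides = overrides or {}
    values = {}
    for name, spec in entry_point["parameters"].items():
        if name in overrides:
            raw = overrides[name]
        elif "default" in spec:
            raw = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
        values[name] = _CASTS[spec["type"]](raw)
    return entry_point["command"].format(**values)


print(render_command(ENTRY_POINT))
print(render_command(ENTRY_POINT, {"learning_rate": "0.05", "epochs": 20}))
```

With a real project, overrides are typically passed on the command line, for example `mlflow run . -P learning_rate=0.05 -P epochs=20`.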
### 3. MLflow Models
Package models in a standard format for deployment across platforms.
```python
import mlflow.sklearn
from mlflow.models import infer_signature
from sklearn.ensemble import RandomForestClassifier
# Train model (X_train and y_train are assumed to be defined already)
model = RandomForestClassifier()
model.fit(X_train, y_train)
# Log model with signature inferred from training inputs and predictions
signature = infer_signature(X_train, model.predict(X_train))
# Note: `name` is the MLflow 3 argument; MLflow 2.x uses `artifact_path`
mlflow.sklearn.log_model(
    sk_model=model,
    name="random-forest-model",
    signature=signature,
    input_example=X_train[:5],
    registered_model_name="ProductionClassifier"
)
```
### 4. MLflow Model Registry
Centralized model store for managing model lifecycle and versioning.
```python
import mlflow
from mlflow import MlflowClient
client = MlflowClient()
# Register model (run_id is assumed to identify a run that logged a model)
model_uri = f"runs:/{run_id}/model"
registered_model = mlflow.register_model(
    model_uri=model_uri,
    name="CustomerChurnModel"
)
# Set model alias for deployment
client.set_registered_model_alias(
    name="CustomerChurnModel",
    alias="production",
    version=registered_model.version
)
```
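Once a version or alias exists, a model is usually addressed by a registry URI. The helper below is a minimal sketch of the two URI forms, `models:/<name>/<version>` and `models:/<name>@<alias>`; the function name `model_uri` is a hypothetical convenience, not part of MLflow.

```python
def model_uri(name, version=None, alias=None):
    """Build a registry URI that targets either a version or an alias."""
    # Exactly one of the two selectors must be supplied.
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of version or alias")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version}"


print(model_uri("CustomerChurnModel", alias="production"))
print(model_uri("CustomerChurnModel", version=3))
```

The resulting string can be passed to a loader such as `mlflow.pyfunc.load_model(...)`, so serving code that points at the `production` alias does not need to change when the alias is moved to a newer version.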
## Experiment Tracking
### Basic Experiment Tracking
```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
# Configure MLflow
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("house-price-prediction")
# Load and prepare data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Training with MLflow tracking
with mlflow.start_run(run_name="rf-baseline"):
    # Define parameters
    params = {
        "n_estimators": 100,
        "max_depth": 10,
        "min_samples_split": 5,
        "random_state": 42
    }
    # Train model
    model = RandomForestRegressor(**params)
    model.fit(X_train, y_train)
    # Evaluate
    predictions = model.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    r2 = r2_score(y_test, predictions)
    # Log everything
    mlflow.log_params(params)
    mlflow.log_metrics({
        "mse": mse,
        "r2": r2,
        "rmse": mse ** 0.5
    })
    # Log model
    mlflow.sklearn.log_model(
        sk_model=model,
        name="model",
        registered_model_name="HousePricePredictor"
    )
```
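For readers who want to see what the three logged metrics measure, here is a small pure-Python sketch that computes MSE, RMSE, and R² by hand on toy values. The function name `regression_metrics` and the sample numbers are made up for illustration; in the example above scikit-learn does this work.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, and R^2 for two equal-length sequences.

    Assumes y_true is not constant, so the total sum of squares is non-zero.
    """
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),      # same units as the target
        "r2": 1 - ss_res / ss_tot,   # share of variance explained
    }


metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

A dictionary shaped like this is what `mlflow.log_metrics` accepts in the run above.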
### Autologging
MLflow provides automatic logging for popular frameworks:
```python
import mlflow
from sklearn.ensemble import RandomForestClassifier
# Enable autologging for scikit-learn
mlflow.sklearn.autolog()
# Your training code - everything is logged automatically
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
```
### Nested Runs for Hyperparameter Tuning
```python
import mlflow
from sklearn.ensemble import GradientBoostingClassifier
mlflow.set_experiment("hyperparameter-tuning")
# Parent run for the entire tuning process
with mlflow.start_run(run_name="grid-search-parent"):
    param_grid = {
        'learning_rate': [0.01, 0.1, 0.3],
        'n_estimators': [50, 100, 200],
        'max_depth': [3, 5, 7]
    }
    # Log parent parameters (this is a manual grid scored on a held-out split,
    # so no cross-validation fold count is logged)
    mlflow.log_param("tuning_method", "grid_search")
    best_score = 0
    best_params = None
    # Nested runs for each parameter combination
    for lr in param_grid['learning_rate']:
        for n_est in param_grid['n_estimators']:
            for depth in param_grid['max_depth']:
                with mlflow.start_run(nested=True, run_name=f"lr{lr}_n{n_est}_d{depth}"):
                    params = {
                        'learning_rate': lr,
                        'n_estimators': n_est,
                        'max_depth': depth
                    }
                    model = GradientBoostingClassifier(**params)
                    model.fit(X_train, y_train)
                    score = model.score(X_test, y_test)
                    mlflow.log_params(params)
                    mlflow.log_metric("accuracy", score)
                    if score > best_score:
                        best_score = score
                        best_params = params
    # Log best results in parent run
    mlflow.log_params({f"best_{k}": v for k, v in best_params.items()})
    mlflow.log_metric("best_accuracy", best_score)
```
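As the grid grows, three hand-written loops become awkward. One possible alternative, sketched here with the standard library only, is to generate the combinations with `itertools.product`. The helper names `iter_param_combinations` and `run_name_for` are hypothetical; each yielded dictionary could be used as `params` inside a nested run.

```python
from itertools import product

param_grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 7],
}


def iter_param_combinations(grid):
    """Yield one dict per point in the Cartesian product of the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_name_for(params):
    """Build a short, readable run name from a parameter combination."""
    return "lr{learning_rate}_n{n_estimators}_d{max_depth}".format(**params)


combos = list(iter_param_combinations(param_grid))
print(len(combos))
print(run_name_for(combos[0]))
```

This keeps the loop body at a single indentation level and makes it easy to add a fourth hyperparameter without another nested `for`.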
### Tracking Multiple Metrics Over Time
```python
import mlflow
import numpy as np
with mlflow.start_run():
    # Log metrics at different steps (epochs)
    for epoch in range(100):
        # Synthetic values standing in for real training and validation losses
        train_loss = np.random.random() * (1 - epoch/100)
        val_loss = np.random.random() * (1 - epoch/100) + 0.1
        mlflow.log_metric("train_loss", train_loss, step=epoch)
        mlflow.log_metric("val_loss", val_loss, step=epoch)
        mlflow.log_metric("learning_rate", 0.01 * (0.95 ** epoch), step=epoch)
```
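Per-step series like these are often reduced to a few summary values. The sketch below, using only the standard library, reproduces the exponential learning-rate decay from the loop above and picks the step with the lowest validation loss. The function names and the sample loss values are illustrative assumptions.

```python
def exponential_decay(initial_lr, decay_rate, num_epochs):
    """Return the learning rate for each epoch: initial_lr * decay_rate ** epoch."""
    return [initial_lr * decay_rate ** epoch for epoch in range(num_epochs)]


def best_step(values):
    """Return (step, value) for the smallest value in a per-step series."""
    step = min(range(len(values)), key=values.__getitem__)
    return step, values[step]


schedule = exponential_decay(0.01, 0.95, 100)
val_losses = [0.9, 0.6, 0.45, 0.47, 0.5]  # made-up validation losses
step, loss = best_step(val_losses)
print(schedule[:3])
print(step, loss)
```

A value such as `loss` could then be logged once as a summary metric alongside the step-indexed series.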
### Logging Artifacts
```python
import mlflow
import matplotlib.pyplot as plt
import pandas as pd
# Assumes `history`, `feature_names`, and a fitted `model` already exist
with mlflow.start_run():
    # Log plot
    plt.figure(figsize=(10, 6))
    plt.plot(history['loss'], label='Training Loss')
    plt.plot(history['val_loss'], label='Validation Loss')
    plt.legend()
    plt.savefig("loss_curve.png")
    mlflow.log_artifact("loss_curve.png")
    # Log dataframe as CSV
    feature_importance = pd.DataFrame({
        'feature': feature_names,
        'importance': model.feature_importances_
    })
    feature_importance.to_csv("feature_importance.csv", index=False)
    mlflow.log_artifact("feature_importance.csv")
    # Log entire directory
    mlflow.log_artifacts("output_dir/", artifact_path="outputs")
```
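Writing artifacts into the working directory leaves files behind after each run. One way to avoid that, sketched here with the standard library, is to write into a temporary directory and log from there. The helper `write_feature_importance` and the sample feature names are hypothetical; the MLflow call is left as a comment so the snippet runs without a tracking server.

```python
import csv
import tempfile
from pathlib import Path


def write_feature_importance(feature_names, importances, out_dir):
    """Write a feature-importance CSV, most important first, and return its path."""
    path = Path(out_dir) / "feature_importance.csv"
    rows = sorted(zip(feature_names, importances), key=lambda r: r[1], reverse=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["feature", "importance"])
        writer.writerows(rows)
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = write_feature_importance(
        ["age", "income", "tenure"], [0.2, 0.5, 0.3], tmp_dir
    )
    # Inside an active run, mlflow.log_artifact(str(csv_path)) would go here,
    # before the temporary directory is removed.
    contents = csv_path.read_text()

print(contents)
```

Because the artifact is copied to the artifact store when it is logged, the local file can be discarded as soon as the `with` block exits.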
## Model Registry
### Registering Models
```python
from mlflow import MlflowClient
import mlflow.sklearn
client = MlflowClient()
# Method 1: Register during model logging
with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=model,
        name="model",
        registered_model_name="CustomerSegmentationModel"
    )
# Method 2: Register an existing model
run_id = "abc123"
model_uri = f"runs:/{run_id}/model"
registered_model = mlflow