embedded-ai-deployment

Deploy AI models to embedded hardware using MathWorks tools (MATLAB, Simulink, Embedded Coder). Covers two workflow patterns: (1) MathWorks-native or 3P-imported models rebuilt as dlnetwork for lean hardware (Cortex-M, DSP), (2) direct C/C++ code generation from PyTorch and LiteRT models for high-performance hardware (Cortex-A, x86, GPU). Trigger when: user wants to deploy AI to embedded targets; generate C/CUDA from neural networks; compress AI models for MCU/DSP; integrate AI in Simulink for system-level simulation; import PyTorch/ONNX/TensorFlow models for embedded deployment; optimize AI for resource-constrained hardware; or use loadPyTorchExportedProgram, importNetworkFromPyTorch, dlquantizer, exportNetworkToSimulink, or Embedded Coder with AI models.


# Embedded AI for Engineered Systems

Deploy AI models to embedded hardware using MATLAB&reg; and Simulink&reg;. This skill is
written specifically for **MATLAB R2026a** and uses APIs, functions, and workflows
introduced in that release. It covers the complete lifecycle: model creation or
import, verification, compression, system-level simulation, and code generation
for resource-constrained targets.

Requires MATLAB R2026a or newer. Core toolboxes: Deep Learning Toolbox, Statistics
and Machine Learning Toolbox, MATLAB Coder, Embedded Coder, Simulink, and
Fixed-Point Designer. Workflow-specific support packages are checked during
Environment Discovery. The MATLAB and Simulink Agentic Toolkits must be available
so Codex can drive a live MATLAB and Simulink session through MCP tools.

## Workflow Pattern Selection

Determine the correct workflow pattern based on model origin and deployment target.

### Decision Tree

Primary discriminator for third-party (3P) models: **model size + hardware class**.

```
Q1: What is the deployment target?
 |
 +-- Cortex-M (M33, M4, M7) ---------------------> Q2
 +-- Cortex-A/R processor or DSP (C2000, etc.) ----> Q2
 +-- x86 processor or GPU (Jetson, CUDA) ----------> Q2
      |
      Q2: Where does the AI model come from?
       |
       +-- Train from scratch in MATLAB ------------> Pattern 1  (references/pattern1/workflow.md)
       +-- Pre-trained 3P model --------------------> Q3
            |
            Q3: Route by hardware class + model size
             |
             +-- Cortex-M: always Pattern 1 import
             |     (MathWorks compression, tight sim-codegen agreement)
             |
             +-- x86 / GPU: Pattern 2 if PyTorch or LiteRT
             |     Pattern 1 import if ONNX/TF (convert to PyTorch/LiteRT recommended)
             |
             +-- Cortex-A/R or DSP:
                   +-- Small model (< 500 KB) ---------> Pattern 1 with import path
                   +-- Large model (> 1 MB):
                        +-- PyTorch / LiteRT -----------> Pattern 2
                        +-- ONNX / TensorFlow ----------> Pattern 1 import *
```

\* Convert to PyTorch&reg; (.pt2) or LiteRT (.tflite) to use Pattern 2 instead.
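The routing in the decision tree can be sketched as a small MATLAB function. This is an illustrative sketch only: `selectWorkflowPattern`, its argument names, and the lowercase labels are hypothetical and not part of the skill. It also assumes 1 MB = 1024 KB, and it returns `NaN` for Cortex-A/R or DSP models between 500 KB and 1 MB because the tree does not say how to route them.

```matlab
function pattern = selectWorkflowPattern(target, origin, modelFormat, sizeKB)
% Hypothetical helper: maps the decision tree (Q1-Q3) to a pattern number.
arguments
    target      (1,1) string {mustBeMember(target, ["cortex-m","cortex-a/r","dsp","x86","gpu"])}
    origin      (1,1) string {mustBeMember(origin, ["matlab","third-party"])}
    modelFormat (1,1) string {mustBeMember(modelFormat, ["none","pytorch","litert","onnx","tensorflow"])} = "none"
    sizeKB      (1,1) double {mustBeNonnegative} = 0
end

% Pattern 2 generates code directly from PyTorch or LiteRT models only.
isDirect = ismember(modelFormat, ["pytorch","litert"]);

if origin == "matlab" || target == "cortex-m"
    % Trained in MATLAB, or any model on Cortex-M: always Pattern 1.
    pattern = 1;
elseif ismember(target, ["x86","gpu"])
    % x86 / GPU: route on format alone.
    if isDirect, pattern = 2; else, pattern = 1; end
elseif sizeKB < 500
    % Cortex-A/R or DSP, small model: Pattern 1 with import path.
    pattern = 1;
elseif sizeKB > 1024
    % Cortex-A/R or DSP, large model (assumes 1 MB = 1024 KB).
    if isDirect, pattern = 2; else, pattern = 1; end
else
    % 500 KB to 1 MB is not covered by the tree; leave the choice to the user.
    pattern = NaN;
end
end
```

For example, `selectWorkflowPattern("dsp", "third-party", "pytorch", 2048)` returns 2, while the same call with `"onnx"` returns 1, matching the footnote about converting to `.pt2` or `.tflite` to reach Pattern 2.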

### Pattern Summary

| Pattern | Model Origin | Target Hardware | Primary Toolchain |
|---------|-------------|-----------------|-------------------|
| **1** | MATLAB-native or 3P imported as dlnetwork | ARM&reg; Cortex&reg;-M (M33, M4, M7), Cortex-A/R, DSP | Embedded Coder&trade; |
| **2** | PyTorch (.pt2) or LiteRT (.tflite) direct code generation | Cortex-A/R, DSP, x86, GPU | MATLAB Coder&trade; + PyTorch & LiteRT support package (SPKG) |

### Pattern 1 vs Pattern 2 Capability Comparison

| Capability | Pattern 1 (dlnetwork) | Pattern 2 (PyTorch/LiteRT direct) |
|-----------|----------------------|----------------------|
| C code generation | Yes | Yes |
| Weight inspection / modification | **Yes** | No |
| dlquantizer (INT8) | **Yes** | No |
| Projection (compressNetworkUsingProjection) | **Yes** | No |
| Pruning | **Yes** | No |
| Simulink integration | **Yes** (exportNetworkToSimulink) | **Yes** (PyTorch SPKG Simulink blocks) |
| Fixed-point codegen | **Yes** | No |
| Combined compression (77%+ flash savings) | **Yes** | No |
| Speed to first C code | Slower | **Faster** |
| Requires native rebuild for 3P models | Yes | No |

**Rule of thumb:** Choose Pattern 1 for small models (< 500 KB) on lean hardware
(Cortex-M, DSP) where you need MathWorks compression and tight simulation-codegen
agreement. Choose Pattern 2 for larger models (> 1 MB) on high-performance hardware
(x86, GPU, Cortex-A) where simulation speed is a priority and compression is done
externally in Python. For Cortex-A/R and DSP targets, model size is the primary
discriminator. Pattern 2 supports PyTorch (.pt2) and LiteRT (.tflite) formats.
Both patterns support Simulink integration.

## Common Start: Prerequisites

Regardless of pattern, **always** begin with these two prerequisite steps before
entering the pattern-specific phases (which start at Phase 1):

1. **Environment Discovery** (silent): Load [`references/shared/environment-setup.md`](references/shared/environment-setup.md)
2. **Project Discovery** (interactive): Load [`references/shared/project-discovery.md`](references/shared/project-discovery.md)

Project Discovery determines the workflow pattern via the decision tree above.

## Banned Legacy Functions

| Legacy (BANNED) | Modern Replacement |
|-----------------|-------------------|
| `trainNetwork` / `trainnetwork` / `train` (for DL) | `trainnet` |
| `DAGNetwork` / `SeriesNetwork` / `network` | `dlnetwork` |
| `importONNXNetwork` / `importONNXLayers` | `importNetworkFromONNX` |
| `importTensorFlowNetwork` / `importKerasNetwork` | `importNetworkFromTensorFlow` |
| `importTensorFlowLayers` / `importKerasLayers` | `importNetworkFromTensorFlow` |
| `taylorPrunableNetwork` / `updateScore` / `updatePrunables` | `compressNetworkUsingTaylorPruning` |
| `csvread` / `xlsread` | `readmatrix` / `readtable` |
| `datenum` | `datetime` |
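One way to enforce the table is to scan a script's text for the legacy names before running it. The sketch below is an assumption about how such a check could look; `findBannedCalls` is a hypothetical helper, not something the skill ships. It deliberately omits `train` and `network`, which are too generic to match by name alone, and it also flags names that appear only in comments.

```matlab
function hits = findBannedCalls(scriptText)
% Hypothetical helper: returns the banned legacy names found in scriptText.
arguments
    scriptText (1,1) string
end

banned = ["trainNetwork","DAGNetwork","SeriesNetwork","importONNXNetwork", ...
    "importONNXLayers","importTensorFlowNetwork","importKerasNetwork", ...
    "importTensorFlowLayers","importKerasLayers","taylorPrunableNetwork", ...
    "updateScore","updatePrunables","csvread","xlsread","datenum"];

% Match each name as a whole word, case-insensitively, so that
% "trainnetwork" is caught but the modern "trainnet" is not.
isHit = arrayfun(@(name) ~isempty(regexpi(scriptText, "\<" + name + "\>", "once")), banned);
hits = banned(isHit);
end
```

A typical use would be `findBannedCalls(string(fileread(scriptPath)))`, where `scriptPath` is the absolute path of the generated `.m` file; an empty result means none of the listed names were found.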

## Global Rules

### ALWAYS

- Check **toolboxes** via `detect_matlab_toolboxes` and **support packages** via `matlabshared.supportpkg.getInstalled` before any workflow step
- If a support package is missing, ask the user to download from Add-On Explorer -- **never** install on their behalf
- Guide the user step-by-step -- one phase at a time
- Use `rng("default")` before any data splitting
- Verify numerical equivalence at each transformation step
- Generate MEX for desktop validation before generating C code for target
- Use `arguments` blocks in all codegen-ready functions
- Use `single` precision for all inference inputs
- **Script-based execution:** For each workflow step done in MATLAB, create a `.m` script file and execute it with `run_matlab_file` or `evaluate_matlab_code`. Do NOT run ad-hoc MATLAB commands without first writing the script file. If a script needs changes, edit the script file and re-run it. This gives users full visibility into what code is being executed and enables reproducibility. **IMPORTANT:** `run_matlab_file` sets the working directory to the script's folder. Always use **absolute paths** (via `fullfile`) for model files, data, and saved outputs — never rely on `pwd` or relative paths.
- **Pause after each workflow step:** After every workflow step completes, pause and explicitly ask the user for permission to proceed to the next step. The goal is to let the user read/inspect the MATLAB scripts you created, review results, and ask questions before moving on.
- **Deep Network Designer:** When a model is trained in MATLAB, imported, or rebuilt as a native dlnetwork, load it in Deep Network Designer (`deepNetworkDesigner(net)`) so the user can visually inspect the architecture. Announce this action and wait for user acknowledgment before proceeding.
- **Numerical equivalency tests (import workflows):** For any import from PyTorch or ONNX:
  1. Run inference on the **original 3P model** (via bundled Python for PyTorch, or ONNX runtime) to collect ground-truth reference data. Do NOT use the imported MATLAB model as reference — its custom autogenerated layers may produce incorrect outputs.
  2. Run the same inputs through the **rebuilt native** MATLAB model and compare against ground truth
  3. After compression, report the accuracy delta vs. the uncompressed baseline (MAE, max error, % accuracy drop). Compute these from variables in the current run — never hardcode numeric values into `fprintf`/`disp` strings, because re-running the script with different inputs or a different model will then print stale numbers.
  4. Run tests to validate numerical equivalence between: compressed model in MATLAB, compressed model in Simulink, and final generated code
- **Test count proposal:** Before running numerical equivalency tests, propose how many tests you plan to run and explain why (considering model complexity, output range, class count, etc.). Wait for user agreement or correction before running them.
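Several of the rules above (seeding with `rng("default")`, `single` inputs, `arguments` blocks, and reporting metrics from live variables rather than hardcoded numbers) can be combined in one short script. This is a minimal sketch under stated assumptions: `compareOutputs` is a hypothetical helper, the test count of 20 is a placeholder to be agreed with the user, and the two output arrays are synthetic stand-ins for the original model's ground truth and the rebuilt model's predictions. Only MAE and maximum absolute error are shown.

```matlab
rng("default");                            % reproducible test inputs
numTests = 20;                             % placeholder; propose and agree this with the user
X = rand(numTests, 8, "single");           % synthetic inputs in single precision

% Stand-ins (assumptions): replace with outputs of the original 3P model
% and of the rebuilt native MATLAB model for the same inputs X.
yRef  = X * ones(8, 1, "single");
yTest = yRef + 1e-6 * randn(size(yRef), "single");

report = compareOutputs(yRef, yTest);

% Values are printed from variables of this run, never typed in by hand.
fprintf("MAE = %.3g, max abs error = %.3g over %d tests\n", ...
    report.mae, report.maxAbsErr, numTests);

function report = compareOutputs(yRef, yTest)
% Hypothetical helper: error metrics between reference and test outputs.
arguments
    yRef  {mustBeNumeric}
    yTest {mustBeNumeric}
end
assert(isequal(size(yRef), size(yTest)), "Outputs must have the same size.");
err = abs(double(yTest(:)) - double(yRef(:)));   % accumulate in double
report.mae = mean(err);
report.maxAbsErr = max(err);
end
```

The same helper can be reused at each transformation step (import, compression, Simulink, generated code) so that every comparison reports the same metrics.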

Source: https://github.com/matlab/skills/tree/main/demos/embedded-ai-deployment/skills/embedded-ai-deployment

Related in Cloud & DevOps

appbuilder-action-scaffolder

Included

Create, implement, deploy, and debug Adobe Runtime actions with consistent layout, validation, and error handling. Use this skill whenever the user needs to add actions to an App Builder project, understand action structure (params, response format, web/raw actions), configure actions in the manifest, use App Builder SDKs (State, Files, Events, database), deploy and invoke actions via CLI, debug action issues, or implement patterns such as webhook receivers, custom event providers, journaling consumers, large payload redirects, action sequence pipelines, and Asset Compute workers. Also trigger when users mention serverless functions in Adobe context, action logging, IMS authentication for actions, or cron-style scheduled actions.

Cloud & DevOpsscripts

orchestrating-datacloud

Included

Salesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. Use this skill when the user needs a multi-step Data Cloud pipeline, cross-phase troubleshooting, or data space and data kit management. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase sf data360 workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching phase-specific skill), the task is STDM/session tracing/parquet telemetry (use observing-agentforce), standard CRM SOQL (use querying-soql), or Apex implementation (use generating-apex).

Cloud & DevOpsscripts

github-project-automation

Included

Automate GitHub repository setup with CI/CD workflows, issue templates, Dependabot, and CodeQL security scanning. Includes 12 production-tested workflows and prevents 18 errors: YAML syntax, action pinning, and configuration. Use when: setting up GitHub Actions CI/CD, creating issue/PR templates, enabling Dependabot or CodeQL scanning, deploying to Cloudflare Workers, implementing matrix testing, or troubleshooting YAML indentation, action version pinning, secrets syntax, runner versions, or CodeQL configuration. Keywords: github actions, github workflow, ci/cd, issue templates, pull request templates, dependabot, codeql, security scanning, yaml syntax, github automation, repository setup, workflow templates, github actions matrix, secrets management, branch protection, codeowners, github projects, continuous integration, continuous deployment, workflow syntax error, action version pinning, runner version, github context, yaml indentation error

Cloud & DevOpsscripts

sf-datacloud

Included

Salesforce Data Cloud product orchestrator for connect→prepare→harmonize→segment→act workflows. TRIGGER when: user needs a multi-step Data Cloud pipeline, asks to set up or troubleshoot Data Cloud across phases, manages data spaces or data kits, or wants a cross-phase `sf data360` workflow. DO NOT TRIGGER when: work is isolated to a single phase (use the matching sf-datacloud-* skill), the task is STDM/session tracing/parquet telemetry (use sf-ai-agentforce-observability), standard CRM SOQL (use sf-soql), or Apex implementation (use sf-apex).

Cloud & DevOpsscripts

fabric-cli

Included

Use this skill for Fabric.so CLI workflows with the `fabric` terminal command: diagnose/install/login, search or browse a Fabric library, save notes/links/files, create folders, ask the Fabric AI assistant, manage tasks/workspaces, generate shell completion, check subscription usage, produce JSON output, and use Fabric as persistent agent memory. Do not use for Microsoft Fabric/Azure/Power BI `fab`, Daniel Miessler's Fabric framework, Python Fabric SSH, Fabric.js, or textile/fashion fabric.

Cloud & DevOpsscripts

lark

Included

Lark/Feishu CLI skills: lark-cli operations for docs, markdown, sheets, base, calendar, im, mail, task, okr, drive, wiki, slides, whiteboard, apps, approval, attendance, contact, vc, minutes, event. Use when the user needs to operate Lark/Feishu resources via lark-cli, send messages, manage documents, spreadsheets, calendars, tasks, OKRs, deploy web pages, or any Feishu/Lark workspace operations.

Cloud & DevOpsscripts