workflow-orchestration-patterns
Master workflow orchestration architecture with Temporal, covering fundamental design decisions, resilience patterns, and best practices for building reliable distributed systems.
What this skill does
# Workflow Orchestration Patterns

Master workflow orchestration architecture with Temporal, covering fundamental design decisions, resilience patterns, and best practices for building reliable distributed systems.

## Use this skill when

- Working on workflow orchestration patterns tasks or workflows
- Needing guidance, best practices, or checklists for workflow orchestration patterns

## Do not use this skill when

- The task is unrelated to workflow orchestration patterns
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## When to Use Workflow Orchestration

### Ideal Use Cases (Source: docs.temporal.io)

- **Multi-step processes** spanning machines/services/databases
- **Distributed transactions** requiring all-or-nothing semantics
- **Long-running workflows** (hours to years) with automatic state persistence
- **Failure recovery** that must resume from last successful step
- **Business processes**: bookings, orders, campaigns, approvals
- **Entity lifecycle management**: inventory tracking, account management, cart workflows
- **Infrastructure automation**: CI/CD pipelines, provisioning, deployments
- **Human-in-the-loop** systems requiring timeouts and escalations

### When NOT to Use

- Simple CRUD operations (use direct API calls)
- Pure data processing pipelines (use Airflow, batch processing)
- Stateless request/response (use standard APIs)
- Real-time streaming (use Kafka, event processors)

## Critical Design Decision: Workflows vs Activities

**The Fundamental Rule** (Source: temporal.io/blog/workflow-engine-principles):

- **Workflows** = Orchestration logic and decision-making
- **Activities** = External interactions (APIs, databases, network calls)

### Workflows (Orchestration)

**Characteristics:**

- Contain business logic and coordination
- **MUST be deterministic** (same inputs → same outputs)
- **Cannot** perform direct external calls
- State automatically preserved across failures
- Can run for years despite infrastructure failures

**Example workflow tasks:**

- Decide which steps to execute
- Handle compensation logic
- Manage timeouts and retries
- Coordinate child workflows

### Activities (External Interactions)

**Characteristics:**

- Handle all external system interactions
- Can be non-deterministic (API calls, DB writes)
- Include built-in timeouts and retry logic
- **Must be idempotent** (calling N times = calling once)
- Short-lived (seconds to minutes typically)

**Example activity tasks:**

- Call payment gateway API
- Write to database
- Send emails or notifications
- Query external services

### Design Decision Framework

```
Does it touch external systems? → Activity
Is it orchestration/decision logic? → Workflow
```

## Core Workflow Patterns

### 1. Saga Pattern with Compensation

**Purpose**: Implement distributed transactions with rollback capability

**Pattern** (Source: temporal.io/blog/compensating-actions-part-of-a-complete-breakfast-with-sagas):

```
For each step:
  1. Register compensation BEFORE executing
  2. Execute the step (via activity)
  3. On failure, run all compensations in reverse order (LIFO)
```

**Example: Payment Workflow**

1. Reserve inventory (compensation: release inventory)
2. Charge payment (compensation: refund payment)
3. Fulfill order (compensation: cancel fulfillment)

**Critical Requirements:**

- Compensations must be idempotent
- Register compensation BEFORE executing step
- Run compensations in reverse order
- Handle partial failures gracefully

### 2. Entity Workflows (Actor Model)

**Purpose**: Long-lived workflow representing single entity instance

**Pattern** (Source: docs.temporal.io/evaluate/use-cases-design-patterns):

- One workflow execution = one entity (cart, account, inventory item)
- Workflow persists for entity lifetime
- Receives signals for state changes
- Supports queries for current state

**Example Use Cases:**

- Shopping cart (add items, checkout, expiration)
- Bank account (deposits, withdrawals, balance checks)
- Product inventory (stock updates, reservations)

**Benefits:**

- Encapsulates entity behavior
- Guarantees consistency per entity
- Natural event sourcing

### 3. Fan-Out/Fan-In (Parallel Execution)

**Purpose**: Execute multiple tasks in parallel, aggregate results

**Pattern:**

- Spawn child workflows or parallel activities
- Wait for all to complete
- Aggregate results
- Handle partial failures

**Scaling Rule** (Source: temporal.io/blog/workflow-engine-principles):

- Don't scale individual workflows
- For 1M tasks: spawn 1K child workflows × 1K tasks each
- Keep each workflow bounded

### 4. Async Callback Pattern

**Purpose**: Wait for external event or human approval

**Pattern:**

- Workflow sends request and waits for signal
- External system processes asynchronously
- Sends signal to resume workflow
- Workflow continues with response

**Use Cases:**

- Human approval workflows
- Webhook callbacks
- Long-running external processes

## State Management and Determinism

### Automatic State Preservation

**How Temporal Works** (Source: docs.temporal.io/workflows):

- Complete program state preserved automatically
- Event History records every command and event
- Seamless recovery from crashes
- Applications restore pre-failure state

### Determinism Constraints

**Workflows Execute as State Machines**:

- Replay behavior must be consistent
- Same inputs → identical outputs every time

**Prohibited in Workflows** (Source: docs.temporal.io/workflows):

- ❌ Threading, locks, synchronization primitives
- ❌ Random number generation (`random()`)
- ❌ Global state or static variables
- ❌ System time (`datetime.now()`)
- ❌ Direct file I/O or network calls
- ❌ Non-deterministic libraries

**Allowed in Workflows**:

- ✅ `workflow.now()` (deterministic time)
- ✅ `workflow.random()` (deterministic random)
- ✅ Pure functions and calculations
- ✅ Calling activities (non-deterministic operations)

### Versioning Strategies

**Challenge**: Changing workflow code while old executions still running

**Solutions**:

1. **Versioning API**: Use `workflow.get_version()` for safe changes
2. **New Workflow Type**: Create new workflow, route new executions to it
3. **Backward Compatibility**: Ensure old events replay correctly

## Resilience and Error Handling

### Retry Policies

**Default Behavior**: Temporal retries activities forever

**Configure Retry**:

- Initial retry interval
- Backoff coefficient (exponential backoff)
- Maximum interval (cap retry delay)
- Maximum attempts (eventually fail)

**Non-Retryable Errors**:

- Invalid input (validation failures)
- Business rule violations
- Permanent failures (resource not found)

### Idempotency Requirements

**Why Critical** (Source: docs.temporal.io/activities):

- Activities may execute multiple times
- Network failures trigger retries
- Duplicate execution must be safe

**Implementation Strategies**:

- Idempotency keys (deduplication)
- Check-then-act with unique constraints
- Upsert operations instead of insert
- Track processed request IDs

### Activity Heartbeats

**Purpose**: Detect stalled long-running activities

**Pattern**:

- Activity sends periodic heartbeat
- Includes progress information
- Timeout if no heartbeat received
- Enables progress-based retry

## Best Practices

### Workflow Design

1. **Keep workflows focused** - Single responsibility per workflow
2. **Small workflows** - Use child workflows for scalability
3. **Clear boundaries** - Workflow orchestrates, activities execute
4. **Test locally** - Use time-skipping test environment

### Activity Design

1. **Idempotent operations** - Safe to retry
2. **Short-lived** - Seconds to minutes, not hours
3. **Timeout configuration** - Always set timeouts
4. **Heartbeat for lon
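
The control flow from the "Saga Pattern with Compensation" section (register the compensation first, execute the step, unwind in LIFO order on failure) can be illustrated with a minimal plain-Python sketch. It deliberately does not use the Temporal SDK; `run_saga`, `SagaFailed`, `demo`, and the step functions are hypothetical names introduced only for this illustration, and in a real workflow each step and compensation would presumably be an activity call.

```python
from typing import Callable, List, Tuple

# A step is a pair: (action, compensation). Both names are illustrative.
Step = Tuple[Callable[[], None], Callable[[], None]]


class SagaFailed(Exception):
    """Raised once all registered compensations have run."""


def run_saga(steps: List[Step]) -> None:
    compensations: List[Callable[[], None]] = []
    try:
        for action, compensate in steps:
            # Register the compensation BEFORE executing the step, so a
            # step that fails halfway is still covered by its compensation.
            compensations.append(compensate)
            action()
    except Exception as exc:
        # Unwind in reverse (LIFO) order of registration.
        for compensate in reversed(compensations):
            compensate()
        raise SagaFailed(str(exc)) from exc


def demo() -> List[str]:
    """Payment workflow from the example, with the charge step failing."""
    log: List[str] = []

    def charge_payment() -> None:
        raise RuntimeError("card declined")  # simulated failure

    steps: List[Step] = [
        (lambda: log.append("reserve inventory"),
         lambda: log.append("release inventory")),
        (charge_payment,
         lambda: log.append("refund payment")),
        (lambda: log.append("fulfill order"),
         lambda: log.append("cancel fulfillment")),
    ]
    try:
        run_saga(steps)
    except SagaFailed:
        pass
    return log


if __name__ == "__main__":
    print(demo())
```

Because the compensation is registered before its step runs, the refund is attempted even though the charge itself failed. This is why the requirements list insists that compensations be idempotent and safe to run when the step had no effect.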