langchain-enterprise-rbac
Enforce tenant isolation and role-based access across LangChain 1.0 chains and LangGraph 1.0 agents: per-request retriever construction, tenant-scoped rate limits, role-scoped tool allowlists, and structured audit logs. Use when building multi-tenant SaaS, passing SOC2 review, or debugging a cross-tenant leak. Trigger with "langchain multi-tenant", "langchain tenant isolation", "langchain rbac", "langchain row-level security", "langchain audit log".
# LangChain Enterprise RBAC (Python)
## Overview
A B2B SaaS team shipped their first RAG feature for two tenants. The factory
code looked innocent: build `PineconeVectorStore` once at module import with
`namespace="acme-corp"` (the first tenant), convert it to a retriever, store
it in a module global, reuse on every request. Six weeks later tenant "Initech"
went live. Their first search returned three documents from Acme Corp.

The singleton retriever had captured the Acme namespace at process start.
`RunnableConfig.configurable["tenant_id"]` was being passed in — but the
retriever never read it, because the filter was baked in. Every request for
every tenant hit the same Pinecone namespace. Security review caught it three
days later and put a hold on the SOC2 renewal. This is pain-catalog entry
**P33**, the single most common cause of cross-tenant leaks in LangChain 1.0
production.

This skill fixes it with four workstreams:
- **P33 — retriever-per-request factory** — build the retriever inside the
chain or agent invocation, keyed by `tenant_id` from `RunnableConfig`. Never
at module scope. Unit-test with two tenants and assert non-overlap.
- **Role-scoped tool allowlist** — build the agent per-request with only the
tools the current user's role permits. Forbidden tools are not passed to
`create_agent` at all, so the model cannot call them even if it tries.
- **Per-tenant rate limit + budget** — scope the `InMemoryRateLimiter` (or a
Redis-backed equivalent) by `tenant_id`, and check a per-tenant USD budget
before invoking the model.
- **Structured audit log** — JSON log with `user_id`, `tenant_id`,
`chain_name`, `tools_called`, `cost_usd`, `outcome`, emitted in both success
and failure paths. Ships to SIEM or BigQuery.

Two failure patterns anchor this skill: **import-time retriever binding (P33)**
and **missing audit log on tool failure** (the `try` block logs on success but
the `except` branch re-raises without emitting, so incident response has no
record of denied tool calls). Pinned: `langchain-core 1.0.x`,
`langgraph 1.0.x`, `langchain-anthropic 1.0.x`, `langchain-openai 1.0.x`,
`langchain-postgres 0.0.15+` (for PGVector RLS), `pinecone-client 5.x`,
`chromadb 0.5.x`. Pain-catalog anchors: **P33 primary**, P18, P24, P31, P37.
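
The second failure pattern (no audit record when a tool call fails) can be avoided by emitting the record from a `finally` block. The following is a minimal sketch using only the standard library, not the skill's own implementation: the logger name `audit`, the helper `audited`, and the extra `error_type` and `ts` fields are assumptions introduced for illustration.

```python
import json
import logging
import sys
import time
from contextlib import contextmanager

# Hypothetical logger name; point its handler at your SIEM or BigQuery sink.
audit_logger = logging.getLogger("audit")
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(logging.StreamHandler(sys.stdout))


@contextmanager
def audited(user_id: str, tenant_id: str, chain_name: str):
    """Emit exactly one JSON audit record, whether the body succeeds or raises."""
    record = {
        "user_id": user_id,
        "tenant_id": tenant_id,
        "chain_name": chain_name,
        "tools_called": [],  # caller appends tool names as they run
        "cost_usd": 0.0,     # caller updates from usage metadata
        "outcome": "success",
    }
    try:
        yield record
    except Exception as exc:
        # Record the failure, then re-raise so callers still see the error.
        record["outcome"] = "error"
        record["error_type"] = type(exc).__name__  # extra field, an assumption
        raise
    finally:
        record["ts"] = time.time()  # extra field, an assumption
        audit_logger.info(json.dumps(record))


# Usage: a denied tool call still produces an audit record.
try:
    with audited("u-1", "initech", "rag_chain") as rec:
        rec["tools_called"].append("delete_note")
        raise PermissionError("role 'viewer' may not call delete_note")
except PermissionError:
    pass
```

Because the log call sits in `finally`, the success path and the failure path share one emission point, so neither can drift out of sync with the other.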
## Prerequisites
- Python 3.10+
- `langchain-core >= 1.0, < 2.0`, `langgraph >= 1.0, < 2.0`
- At least one vector-store backend: `pinecone-client`, `langchain-postgres`
(PGVector), `chromadb`, or `faiss-cpu` (single-tenant only — see §Step 2)
- A structured logging sink: stdout JSON for local, Cloud Logging / Datadog /
Splunk HEC / BigQuery streaming insert for production
- A tenant authorization claim in every request (JWT `tid` claim, session
cookie, or header — the auth boundary is out of scope for this skill but
assumed correct)
## Instructions
### Step 1 — Build the retriever per-request, never at import
Move retriever construction inside the chain or agent invocation, keyed by
`config["configurable"]["tenant_id"]`.
```python
from langchain_core.runnables import RunnableConfig, RunnableLambda
from langchain_pinecone import PineconeVectorStore

# `emb`, `prompt`, and `model` are assumed to be defined elsewhere
# (embedding model, prompt template, chat model).

# WRONG: retriever bound at import time with the first tenant's namespace.
# RETRIEVER = PineconeVectorStore(index_name="rag", namespace="acme-corp",
#                                 embedding=emb).as_retriever(search_kwargs={"k": 4})

# RIGHT: factory called per-request, reads tenant from RunnableConfig.
def retriever_for(config: RunnableConfig):
    # .get() so a missing key reaches the PermissionError below instead of
    # surfacing as a bare KeyError. Required, no default.
    tenant_id = config.get("configurable", {}).get("tenant_id")
    if not tenant_id:
        raise PermissionError("tenant_id missing from RunnableConfig")
    store = PineconeVectorStore(
        index_name="rag",
        namespace=tenant_id,  # P33 fix: namespace per-invocation
        embedding=emb,
    )
    return store.as_retriever(search_kwargs={"k": 4})


def retrieve(inputs: dict, config: RunnableConfig):
    return retriever_for(config).invoke(inputs["query"])


chain = RunnableLambda(retrieve) | prompt | model
result = chain.invoke({"query": "..."},
                      config={"configurable": {"tenant_id": "initech"}})
```
No default on `tenant_id` — a missing tenant must be a hard error, not a silent
fallback. See [Retriever-per-request](references/retriever-per-request.md) for
factory lifecycle and PGVector RLS / Chroma / FAISS adapters.
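
The two-tenant non-overlap check can be exercised without a live Pinecone index. The sketch below assumes a hypothetical in-memory `FakeStore` in place of the real vector store; `FakeStore`, the sample documents, and this simplified `retriever_for` are illustrative stand-ins, so it shows the shape of the test rather than the skill's actual test suite.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStore:
    """Hypothetical stand-in for a namespaced vector store."""
    docs_by_namespace: dict[str, list[str]] = field(default_factory=dict)

    def search(self, namespace: str, query: str) -> list[str]:
        # Only the requested namespace is ever read.
        docs = self.docs_by_namespace.get(namespace, [])
        return [d for d in docs if query.lower() in d.lower()]


STORE = FakeStore({
    "acme-corp": ["Acme pricing sheet", "Acme onboarding guide"],
    "initech": ["Initech pricing memo"],
})


def retriever_for(config: dict):
    """Same contract as the per-request factory: tenant comes from the config."""
    tenant_id = (config.get("configurable") or {}).get("tenant_id")
    if not tenant_id:
        raise PermissionError("tenant_id missing from RunnableConfig")
    return lambda query: STORE.search(tenant_id, query)


def test_tenants_do_not_overlap():
    acme = retriever_for({"configurable": {"tenant_id": "acme-corp"}})("pricing")
    initech = retriever_for({"configurable": {"tenant_id": "initech"}})("pricing")
    assert acme and initech               # both tenants get results
    assert set(acme).isdisjoint(initech)  # no document is visible to both


def test_missing_tenant_is_hard_error():
    try:
        retriever_for({"configurable": {}})
    except PermissionError:
        return
    raise AssertionError("expected PermissionError for missing tenant_id")
```

Both functions follow pytest naming, so they can be collected as-is; against a real store the same assertions would run on document IDs returned for each tenant.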
### Step 2 — Pick a vector-store isolation primitive
| Store | Isolation primitive | Per-tenant latency | Max tenants | Safety notes |
|---|---|---|---|---|
| **Pinecone** | `namespace=tenant_id` per query | ~40ms p50 (shared index) | 100,000+ per index | Namespace is the documented isolation boundary; still apply metadata `{"tenant_id": tid}` as defense in depth |
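| **PGVector** | Postgres row-level security (RLS) on `tenant_id` column | ~20ms p50 (HNSW index) | Bounded by Postgres row count | Set the tenant per transaction with `SELECT set_config('app.tenant_id', :tid, true)`, the bind-parameter-safe equivalent of `SET LOCAL app.tenant_id` (plain `SET` does not accept server-side bind parameters); RLS policy `USING (tenant_id = current_setting('app.tenant_id'))` |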
| **Chroma** | Collection-per-tenant (`get_or_create_collection(name=tid)`) | ~30ms p50 | ~1,000 before metadata overhead | Good isolation at small scale; collection creation is synchronous — provision lazily but cache the handle per-request |
| **FAISS** | In-process index-per-tenant | ~5ms p50 (in-memory) | 10-50 practical limit | Poor fit for multi-tenant SaaS: cannot shard across processes, reloads on every deploy, no durable filter. Use for single-tenant evaluation only |

Pinecone scales highest. PGVector with RLS is the strongest primitive when
isolation must be auditable at the database layer (RLS is enforced by the
server even if application code is bypassed). Chroma is fine for ≤1,000
tenants. FAISS is not a multi-tenant production choice; document this so future
engineers do not adopt it. See
[Vector-store isolation](references/vector-store-isolation.md) for the PGVector
RLS DDL and Chroma lifecycle.
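
To make the PGVector row concrete, here is a rough sketch of the SQL involved, held as Python strings so it can be passed to any driver. It is not the DDL from the referenced file: the table name `documents`, the policy name `tenant_isolation`, and the helper `tenant_scope_statement` are hypothetical. It uses `set_config(..., true)`, which is transaction-local like `SET LOCAL` but accepts a bind parameter.

```python
# Hypothetical table and policy names; adapt to your schema.
RLS_DDL = """
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
ALTER TABLE documents FORCE ROW LEVEL SECURITY;  -- apply to the table owner too
CREATE POLICY tenant_isolation ON documents
    USING (tenant_id = current_setting('app.tenant_id'));
"""


def tenant_scope_statement(tenant_id: str) -> tuple[str, dict[str, str]]:
    """Return (sql, params) to run first inside each request's transaction."""
    if not tenant_id:
        raise PermissionError("tenant_id missing")
    # Third argument true = local to the current transaction, so the setting
    # cannot leak to the next request served by a pooled connection.
    return "SELECT set_config('app.tenant_id', :tid, true)", {"tid": tenant_id}


sql, params = tenant_scope_statement("initech")
```

The `:tid` placeholder follows the named-parameter style used in the table above; if `tenant_id` is not a text column, the policy expression needs a matching cast.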
### Step 3 — Build the agent per-request with a role-scoped tool allowlist
Never bind every tool to every agent and trust the model to pick the right one.
Build the agent inside the request handler with only the tools the user's role
permits.
```python
# In LangChain 1.0, create_agent is exported from langchain.agents.
from langchain.agents import create_agent

# `model` and the four tool objects are assumed to be defined elsewhere.

# Map role -> allowed tool names. Owned by IAM config, not the skill.
ROLE_TOOLS: dict[str, set[str]] = {
    "viewer": {"search_docs"},
    "editor": {"search_docs", "create_note"},
    "admin": {"search_docs", "create_note", "delete_note", "export_audit"},
}
ALL_TOOLS = {t.name: t for t in [search_docs, create_note, delete_note, export_audit]}


def agent_for(user_role: str, tenant_id: str):
    # tenant_id is not used in this sketch.
    allowed = ROLE_TOOLS.get(user_role, set())  # unknown role -> no tools
    tools = [ALL_TOOLS[n] for n in sorted(allowed) if n in ALL_TOOLS]
    # Forbidden tools are not passed in, so the model never sees them.
    return create_agent(model, tools=tools)


agent = agent_for(user_role="viewer", tenant_id="initech")
# viewer has no delete_note -> the agent cannot call it.
```
Add a denylist for dangerous argument patterns (SQL with `DROP` / `TRUNCATE`,
shell with `rm -rf` / `sudo`, URLs to internal metadata endpoints) via a
`pre_model_hook` or tool wrapper — allowlist bounds *which* tools run,
denylist bounds *what arguments* they accept. See
[Role-scoped tool allowlist](references/role-scoped-tool-allowlist.md).
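
One way to apply the argument denylist is a plain decorator around each tool function, run before the tool body. The sketch below uses only the standard library; `DENY_PATTERNS`, `guard_args`, and `run_sql` are hypothetical names, and the patterns are illustrative, not a vetted security list.

```python
import functools
import re
from typing import Any, Callable

# Illustrative patterns only; a real denylist is owned by security review.
DENY_PATTERNS = [
    re.compile(r"\b(DROP|TRUNCATE)\b", re.IGNORECASE),  # destructive SQL
    re.compile(r"\brm\s+-rf\b|\bsudo\b"),               # destructive shell
    re.compile(r"169\.254\.169\.254"),                  # cloud metadata address
]


def guard_args(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool function so string arguments are screened before it runs."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        for value in [*args, *kwargs.values()]:
            if isinstance(value, str) and any(p.search(value) for p in DENY_PATTERNS):
                raise PermissionError(f"{func.__name__}: argument matches a denied pattern")
        return func(*args, **kwargs)
    return wrapper


@guard_args
def run_sql(query: str) -> str:
    # Hypothetical tool body; a real tool would execute the query.
    return f"ran: {query}"
```

Pattern matching of this kind is easy to evade, so it works as a second layer behind the allowlist, not as a replacement for it.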
### Step 4 — Scope rate limits and budgets by tenant
Per-tenant limits prevent one tenant's runaway job from exhausting shared model
capacity. Rate-limit detail is covered in `langchain-rate-limits`; cost-budget
detail in `langchain-cost-tuning`. The only RBAC-specific requirement here:
**limiter key must include `tenant_id`** — never a process-global singleton.
```python
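# Sketch only; see langchain-rate-limits for production implementation.
from langchain_core.rate_limiters import InMemoryRateLimiter
# TENANT_TIER_LIMITS (tenant_id -> requests per second) is assumed to be defined elsewhere.
_limiters: dict[str, InMemoryRateLimiter] = {}

def limiter_for(tenant_id: str) -> InMemoryRateLimiter:
    if tenant_id not in _limiters:
        # Tier lookup: different plans get different budgets.
        rps = TENANT_TIER_LIMITS.get(tenant_id, 1.0)  # default: 1.0 rps
        # The source is truncated here; the rest is an assumed completion.
        _limiters[tenant_id] = InMemoryRateLimiter(requests_per_second=rps)
    return _limiters[tenant_id]
```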