exa-core-workflow-b
Execute Exa findSimilar, getContents, answer, and streaming answer workflows. Use when finding pages similar to a URL, retrieving content for known URLs, or getting AI-generated answers with citations. Trigger with phrases like "exa find similar", "exa get contents", "exa answer", "exa similarity search", "findSimilarAndContents".
# Exa Core Workflow B — Similarity, Contents & Answer
## Overview
Secondary Exa workflow covering three endpoints beyond search: `findSimilar` (discover pages semantically related to a URL), `getContents` (retrieve text/highlights for known URLs), and `answer` (get AI-generated answers with web citations). These complement the primary search workflow in `exa-core-workflow-a`.
## Prerequisites
- `exa-js` installed and `EXA_API_KEY` configured
- Familiarity with `exa-core-workflow-a` search patterns
## Instructions
### Step 1: Find Similar Pages
```typescript
import Exa from "exa-js";
const exa = new Exa(process.env.EXA_API_KEY);
// findSimilar takes a URL (not a query string) and returns
// pages with semantically similar content
const similar = await exa.findSimilar(
"https://openai.com/research/gpt-4",
{
numResults: 10,
excludeSourceDomain: true, // exclude openai.com from results
startPublishedDate: "2024-01-01T00:00:00.000Z",
excludeDomains: ["reddit.com", "twitter.com"],
}
);
for (const r of similar.results) {
console.log(`${r.title} — ${r.url}`);
}
```
### Step 2: Find Similar with Contents
```typescript
// findSimilarAndContents combines similarity search + content extraction
const results = await exa.findSimilarAndContents(
"https://huggingface.co/blog/llama3",
{
numResults: 5,
text: { maxCharacters: 2000 },
highlights: { maxCharacters: 500, query: "open source LLM" },
excludeSourceDomain: true,
}
);
for (const r of results.results) {
console.log(`## ${r.title}`);
console.log(`URL: ${r.url}`);
console.log(`Highlights: ${r.highlights?.join(" | ")}`);
console.log(`Text preview: ${r.text?.substring(0, 300)}...\n`);
}
```
### Step 3: Get Contents for Known URLs
```typescript
// getContents retrieves page content for a list of URLs you already have
// Useful when you have URLs from a previous search or external source
const contents = await exa.getContents(
[
"https://arxiv.org/abs/2401.00001",
"https://arxiv.org/abs/2401.00002",
"https://blog.example.com/article",
],
{
text: { maxCharacters: 3000 },
highlights: { maxCharacters: 500 },
summary: { query: "key findings and methodology" },
livecrawl: "preferred", // try fresh, fall back to cache
livecrawlTimeout: 15000, // 15s timeout
// Subpage crawling: retrieve linked pages from each URL
subpages: 3, // crawl up to 3 subpages per URL
subpageTarget: "documentation", // find subpages matching this term
}
);
for (const r of contents.results) {
console.log(`${r.title}: ${r.text?.length || 0} chars`);
if (r.summary) console.log(`Summary: ${r.summary}`);
}
```
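Because `getContents` takes URLs you supply yourself, it can help to screen them before the call. The following is a minimal sketch of such a pre-check; `partitionUrls` is a hypothetical helper, not part of `exa-js`, and it relies only on the standard `URL` constructor.

```typescript
// Hypothetical helper (not part of exa-js): split candidate URLs into
// well-formed http(s) URLs and rejects before passing them to getContents.
function partitionUrls(candidates: string[]): { valid: string[]; invalid: string[] } {
  const valid: string[] = [];
  const invalid: string[] = [];
  for (const raw of candidates) {
    const candidate = raw.trim();
    try {
      // The URL constructor throws on input without a protocol or otherwise malformed.
      const parsed = new URL(candidate);
      if (parsed.protocol === "http:" || parsed.protocol === "https:") {
        valid.push(candidate);
      } else {
        invalid.push(raw);
      }
    } catch {
      invalid.push(raw);
    }
  }
  return { valid, invalid };
}

const { valid, invalid } = partitionUrls([
  "https://arxiv.org/abs/2401.00001",
  "arxiv.org/abs/2401.00002", // missing protocol, lands in `invalid`
  "ftp://example.com/file", // parses, but is not http(s)
]);
console.log("valid:", valid);
console.log("invalid:", invalid);
```

Only the `valid` list would then be passed to `exa.getContents`; the `invalid` list can be logged or repaired separately.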
### Step 4: AI-Powered Answer with Citations
```typescript
// answer() searches the web and returns an AI-generated answer with sources
const answer = await exa.answer(
"What are the key differences between RAG and fine-tuning for LLMs?",
{
text: true,
// The answer response includes citations linking to source results
}
);
console.log("Answer:", answer.answer);
console.log("\nSources:");
// Sources are returned on the `citations` field of the answer response
for (const r of answer.citations) {
console.log(` - ${r.title}: ${r.url}`);
}
```
### Step 5: Streaming Answer
```typescript
// streamAnswer returns chunks as they're generated
for await (const chunk of exa.streamAnswer(
"What is the current state of quantum computing in 2025?"
)) {
if (chunk.content) {
process.stdout.write(chunk.content);
}
if (chunk.citations) {
console.log("\n\nCitations:", JSON.stringify(chunk.citations, null, 2));
}
}
```
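When the full answer is needed after streaming finishes (for logging or caching, say), the chunks can be accumulated rather than only printed. The sketch below assumes the chunk shape used in the loop above (`content` and `citations`); the `Citation` fields, `collectAnswer`, and `fakeStream` are illustrative names, not SDK exports, and the stand-in generator lets the example run without network access.

```typescript
// Assumed chunk shape, based on the streaming loop above.
// The citation fields (url, title) are assumptions for illustration.
type Citation = { url: string; title?: string | null };
type AnswerChunk = { content?: string; citations?: Citation[] };

// Accumulate streamed text and de-duplicate citations by URL,
// keeping the first occurrence of each.
async function collectAnswer(
  stream: AsyncIterable<AnswerChunk>
): Promise<{ text: string; citations: Citation[] }> {
  let text = "";
  const byUrl = new Map<string, Citation>();
  for await (const chunk of stream) {
    if (chunk.content) text += chunk.content;
    for (const c of chunk.citations ?? []) {
      if (!byUrl.has(c.url)) byUrl.set(c.url, c);
    }
  }
  return { text, citations: Array.from(byUrl.values()) };
}

// Stand-in generator that imitates a stream of answer chunks.
async function* fakeStream(): AsyncGenerator<AnswerChunk> {
  yield { content: "Hello, " };
  yield { content: "world.", citations: [{ url: "https://example.com/a" }] };
  yield { citations: [{ url: "https://example.com/a" }, { url: "https://example.com/b" }] };
}

collectAnswer(fakeStream()).then(({ text, citations }) => {
  console.log(text);
  console.log(citations.map((c) => c.url));
});
```

With the SDK, passing `exa.streamAnswer(query)` in place of `fakeStream()` should work as long as the chunk type of your `exa-js` version is compatible with `AnswerChunk`.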
## Output
- Similar pages discovered from a seed URL
- Page content (text, highlights, summary) for known URLs
- AI-generated answers with web source citations
- Streaming answer chunks for real-time display
## Error Handling
| Error | HTTP Code | Cause | Solution |
|-------|-----------|-------|----------|
| `INVALID_URLS` | 400 | Malformed URLs in getContents | Validate URLs have protocol |
| `CRAWL_NOT_FOUND` | 404 | Content unavailable at URL | Verify URL is accessible |
| `CRAWL_TIMEOUT` | 504 | Live crawl exceeded timeout | Increase `livecrawlTimeout` |
| `SOURCE_NOT_AVAILABLE` | 403 | Paywalled or blocked content | Try without `livecrawl: "always"` |
| `UNABLE_TO_GENERATE_RESPONSE` | 501 | Insufficient data for answer | Rephrase query or add context |
| Empty `similar.results` | 200 | Seed URL not indexed | Try a more popular seed URL |
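Some of the failures in the table are transient (a crawl timeout, for example), while others will not succeed on retry (malformed URLs, blocked content). One way to act on that distinction is a small retry wrapper, sketched below. It is a hypothetical helper, not part of `exa-js`: it assumes a thrown error may carry a numeric `statusCode` or `status` property, and the set of retryable codes (including 429 for rate limiting, which the table does not list) is an assumption to adjust for your SDK version.

```typescript
// Hypothetical helper, not part of exa-js.
type RetryOptions = { retries: number; baseDelayMs: number };

// Assumption: errors expose an HTTP status as `statusCode` or `status`.
function getStatus(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null) {
    const e = err as { statusCode?: unknown; status?: unknown };
    if (typeof e.statusCode === "number") return e.statusCode;
    if (typeof e.status === "number") return e.status;
  }
  return undefined;
}

// Assumed set of statuses worth retrying; 400, 403, 404, and 501 are rethrown at once.
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = getStatus(err);
      const retryable = status !== undefined && RETRYABLE.has(status);
      if (!retryable || attempt >= opts.retries) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delay = opts.baseDelayMs * Math.pow(2, attempt);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Demo with a stand-in function that fails twice with 504, then succeeds.
let calls = 0;
async function flaky(): Promise<string> {
  calls++;
  if (calls < 3) throw Object.assign(new Error("CRAWL_TIMEOUT"), { statusCode: 504 });
  return "ok";
}

withRetry(flaky, { retries: 3, baseDelayMs: 10 }).then((value) => {
  console.log(value, "after", calls, "calls");
});
```

In practice the wrapped function would be an SDK call, for example `withRetry(() => exa.getContents(urls, { text: true }), { retries: 3, baseDelayMs: 500 })`.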
## Examples
### Competitive Intelligence Pipeline
```typescript
async function findCompetitors(companyUrl: string) {
// Find companies similar to a given company
const similar = await exa.findSimilarAndContents(companyUrl, {
numResults: 10,
excludeSourceDomain: true,
text: { maxCharacters: 500 },
category: "company",
});
return similar.results.map(r => ({
name: r.title,
url: r.url,
description: r.text?.substring(0, 200),
}));
}
```
### Batch URL Content Retrieval
```typescript
async function enrichUrls(urls: string[]) {
// Process URLs in batches to stay within rate limits
const batchSize = 10;
const allContents = [];
for (let i = 0; i < urls.length; i += batchSize) {
const batch = urls.slice(i, i + batchSize);
const contents = await exa.getContents(batch, {
text: { maxCharacters: 1500 },
summary: { query: "main topic and key points" },
});
allContents.push(...contents.results);
}
return allContents;
}
```
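In the batch loop above, one rejected `getContents` call aborts the whole run. If partial results are acceptable, failed batches can be recorded while the loop carries on. The sketch below shows that variant; `mapInBatches` is a hypothetical generic helper, and the demo uses a stand-in function rather than the Exa client.

```typescript
// Hypothetical generic helper: run `fn` over fixed-size batches in sequence,
// recording failed batches instead of aborting the whole run.
async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  fn: (batch: T[]) => Promise<R[]>
): Promise<{ results: R[]; failed: { batch: T[]; error: unknown }[] }> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const results: R[] = [];
  const failed: { batch: T[]; error: unknown }[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    try {
      results.push(...(await fn(batch)));
    } catch (error) {
      // Keep the failing batch so it can be retried or reported later.
      failed.push({ batch, error });
    }
  }
  return { results, failed };
}

// Demo with a stand-in fetcher that rejects any batch containing "bad".
async function fakeFetch(batch: string[]): Promise<string[]> {
  if (batch.indexOf("bad") !== -1) throw new Error("batch failed");
  return batch.map((u) => u.toUpperCase());
}

mapInBatches(["a", "b", "bad", "c", "d"], 2, fakeFetch).then(({ results, failed }) => {
  console.log("results:", results);
  console.log("failed batches:", failed.map((f) => f.batch));
});
```

With the SDK, the callback could be something like `async (batch) => (await exa.getContents(batch, { text: true })).results`, leaving the batches in `failed` for a later retry.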
## Resources
- Exa Find Similar
- [Exa Get Contents](https://docs.exa.ai/reference/get-contents)
- [Exa Contents Retrieval](https://docs.exa.ai/reference/contents-retrieval)
## Next Steps
For common errors, see `exa-common-errors`. For SDK patterns, see `exa-sdk-patterns`.