
futuresearch-python


Use when the user wants Claude to dispatch researchers to forecast, score, classify, or add to a dataset at scale.

What this skill does


# FutureSearch Python SDK

FutureSearch gives Claude a research team for your data. Use this skill when writing Python code that needs to perform any of the operations listed below.

> **Documentation**: For detailed guides, case studies, and API reference, see:
> - Docs site: [futuresearch.ai/docs](https://futuresearch.ai/docs)
> - GitHub: [github.com/futuresearch/futuresearch-python](https://github.com/futuresearch/futuresearch-python)

**Operations:**
- Classify rows into predefined categories
- Rank/score rows based on qualitative criteria
- Deduplicate data using semantic understanding
- Merge tables using AI-powered matching
- Forecast probabilities for binary questions
- Run AI agents over dataframe rows

## Installation

### Python SDK

```bash
pip install futuresearch
```

### MCP Server (for Claude Code, Claude Desktop, Cursor, etc.)

If the FutureSearch MCP server is available (its tools appear as `futuresearch_classify`, `futuresearch_rank`, and so on), you can use it directly without writing Python code. The MCP server operates on uploaded data (referenced by artifact ID) or on inline JSON rows.

To install the MCP server, add to your MCP config:

```json
{
  "mcpServers": {
    "futuresearch": {
      "type": "http",
      "url": "https://mcp.futuresearch.ai/mcp"
    }
  }
}
```

Config file locations:
- **Claude Code**: `~/.claude.json` (user) or `.mcp.json` (project)
- **Claude Desktop**: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
- **Cursor**: `~/.cursor/mcp.json`
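
If you prefer to script the config change, the sketch below merges the `futuresearch` entry shown above into an existing config file without dropping other servers. The helper name `add_futuresearch_server` is hypothetical, and the code assumes the file is plain JSON with a top-level `mcpServers` object, as in the snippet above.

```python
import json
from pathlib import Path

# Entry copied from the MCP config snippet above.
FUTURESEARCH_ENTRY = {"type": "http", "url": "https://mcp.futuresearch.ai/mcp"}


def add_futuresearch_server(config_path: Path) -> dict:
    """Merge the FutureSearch entry into an MCP config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)
    # Create "mcpServers" if it is missing, then add or overwrite our entry.
    servers = config.setdefault("mcpServers", {})
    servers["futuresearch"] = dict(FUTURESEARCH_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For a project-level Claude Code config, this would be called as `add_futuresearch_server(Path(".mcp.json"))`.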

## When to Use SDK vs MCP

**Use MCP tools** when:
- You need a quick one-off operation on a CSV file
- The user wants direct results without writing code
- The task is a simple lookup or enrichment

**Use the Python SDK** when:
- The workflow is complex and multi-step (dedupe → merge → research)
- You need custom data transformations
- You are integrating with existing Python scripts
- You need full control over execution and intermediate results

---

# MCP Server Tools

If you have the FutureSearch MCP server configured, these 18 tools are available. All data processing tools accept input via `artifact_id` (from upload_data or request_upload_url) or `data` (inline JSON rows). Provide exactly one.
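
The "exactly one" rule is easy to get wrong when arguments are assembled programmatically. Below is a minimal sketch of a client-side check; `input_args` is a hypothetical helper, not part of the SDK or the MCP server.

```python
from typing import Any, Optional


def input_args(artifact_id: Optional[str] = None,
               data: Optional[list[dict[str, Any]]] = None) -> dict[str, Any]:
    """Return the input portion of a tool call, enforcing exactly one source."""
    # True when both are missing or both are given; either case is invalid.
    if (artifact_id is None) == (data is None):
        raise ValueError("provide exactly one of artifact_id or data")
    if artifact_id is not None:
        return {"artifact_id": artifact_id}
    return {"data": data}
```

The returned dict can be merged into the rest of a tool's arguments, for example `{"task": "...", **input_args(data=rows)}`.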

## Core Operations

### futuresearch_agent
Run web research agents on each row.
```
Parameters:
- task: (required) Natural language description of research task
- artifact_id: Artifact ID (UUID) from upload_data or request_upload_url
- data: Inline data as a list of row objects
- response_schema: (optional) JSON schema for per-row agent response
- session_id: (optional) Session UUID to resume
- session_name: (optional) Name for a new session
```
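
As an illustration, the arguments for a `futuresearch_agent` call with inline data and a per-row `response_schema` might look like the sketch below. The task text, column names, and schema fields are made up for the example, and which JSON Schema features the server accepts is not specified here, so the schema sticks to a plain object with typed properties.

```python
import json

# Illustrative arguments only: rows, task wording, and schema fields are assumptions.
agent_args = {
    "task": "Find the company's headquarters city and founding year.",
    "data": [{"company": "Acme"}, {"company": "Globex"}],  # inline rows, so no artifact_id
    "response_schema": {
        "type": "object",
        "properties": {
            "headquarters": {"type": "string"},
            "founded_year": {"type": "integer"},
        },
        "required": ["headquarters", "founded_year"],
    },
    "session_name": "company-research",
}

# Show the payload as it would be serialized for the tool call.
print(json.dumps(agent_args, indent=2))
```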

### futuresearch_single_agent
Run a single research agent on one input (no CSV needed).
```
Parameters:
- task: (required) Natural language task for the agent
- input_data: (optional) Context as key-value pairs (e.g. {"company": "Acme"})
- response_schema: (optional) JSON schema for the agent response
- session_id: (optional) Session UUID to resume
- session_name: (optional) Name for a new session
```

### futuresearch_rank
Score and sort rows based on qualitative criteria.
```
Parameters:
- task: (required) Natural language instructions for scoring a single row
- field_name: (required) Name of the score field to add
- artifact_id: Artifact ID (UUID) from upload_data or request_upload_url
- data: Inline data as a list of row objects
- field_type: (optional) "float" (default), "int", "str", or "bool"
- ascending_order: (optional) Sort ascending (default: true)
- response_schema: (optional) JSON schema for the response model
- session_id / session_name: (optional)
```
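
Because `ascending_order` defaults to true, a "best first" ranking needs it set to false explicitly. The sketch below assembles `futuresearch_rank` arguments and checks `field_type` against the values listed above; `rank_args` is a hypothetical helper and the example rows are invented.

```python
from typing import Any

ALLOWED_FIELD_TYPES = {"float", "int", "str", "bool"}


def rank_args(task: str, field_name: str, data: list[dict[str, Any]],
              field_type: str = "float", ascending_order: bool = True) -> dict[str, Any]:
    """Build futuresearch_rank arguments for inline data."""
    if field_type not in ALLOWED_FIELD_TYPES:
        raise ValueError(f"field_type must be one of {sorted(ALLOWED_FIELD_TYPES)}")
    return {
        "task": task,
        "field_name": field_name,
        "data": data,
        "field_type": field_type,
        "ascending_order": ascending_order,
    }


# Highest score first: override the ascending default.
args = rank_args(
    task="Score how well this company fits a B2B SaaS buyer profile, from 0 to 1.",
    field_name="fit_score",
    data=[{"company": "Acme"}, {"company": "Globex"}],
    ascending_order=False,
)
```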

### futuresearch_dedupe
Remove duplicate rows using semantic equivalence.
```
Parameters:
- equivalence_relation: (required) Natural language description of what makes rows duplicates
- artifact_id: Artifact ID (UUID) from upload_data or request_upload_url
- data: Inline data as a list of row objects
- session_id / session_name: (optional)
```

### futuresearch_merge
Join two tables using intelligent entity matching (LEFT JOIN semantics).
```
Parameters:
- task: (required) Natural language description of how to match rows
- left_artifact_id / left_data: (required, exactly one) Left table — the table being enriched (all rows kept)
- right_artifact_id / right_data: (required, exactly one) Right table — lookup/reference (columns appended to matches)
- merge_on_left: (optional) Only set if you expect exact string matches or want to draw agent attention to a column
- merge_on_right: (optional) Same as merge_on_left for right table
- relationship_type: (optional) "many_to_one" (default), "one_to_one", "one_to_many", "many_to_many"
- use_web_search: (optional) "auto" (default), "yes", or "no"
- session_id / session_name: (optional)
```
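
To make the LEFT JOIN semantics concrete, here is a small pure-Python illustration that uses case-insensitive exact matching as a stand-in for the tool's entity matching. It is not how `futuresearch_merge` matches rows; it only shows the shape of the result under the default `many_to_one` relationship: every left row is kept, and right-table columns are appended where a match exists. How the real tool represents unmatched rows is not covered here.

```python
from typing import Any

Row = dict[str, Any]


def left_join(left: list[Row], right: list[Row], left_key: str, right_key: str) -> list[Row]:
    """Keep every left row; append right columns when the normalized keys match."""
    index: dict[str, Row] = {}
    for row in right:
        # First right row wins for each normalized key.
        index.setdefault(str(row[right_key]).strip().lower(), row)
    merged = []
    for row in left:
        match = index.get(str(row[left_key]).strip().lower(), {})
        # Append right-side columns, skipping the join key and any name clashes.
        extra = {k: v for k, v in match.items() if k != right_key and k not in row}
        merged.append({**row, **extra})
    return merged


left = [{"company": "Acme Inc"}, {"company": "acme inc"}, {"company": "Globex"}]
right = [{"name": "ACME INC", "ticker": "ACM"}]
result = left_join(left, right, "company", "name")
```

Both Acme rows pick up the `ticker` column from the single right row, while the Globex row is kept unchanged.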

### futuresearch_forecast
Forecast the probability of binary questions.
```
Parameters:
- artifact_id: Artifact ID (UUID) from upload_data or request_upload_url
- data: Inline data as a list of row objects (must include "question" column)
- context: (optional) Batch-level context for all questions
- session_id / session_name: (optional)
```
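
Since inline rows must carry a "question" column, it can help to check for it before submitting a batch. The sketch below is a client-side guard under that one documented requirement; `forecast_args` is a hypothetical helper, and treating an empty question as missing is an assumption.

```python
from typing import Any, Optional


def forecast_args(rows: list[dict[str, Any]], context: Optional[str] = None) -> dict[str, Any]:
    """Build futuresearch_forecast arguments, rejecting rows without a question."""
    # Assumption: a blank or whitespace-only question counts as missing.
    missing = [i for i, row in enumerate(rows) if not str(row.get("question", "")).strip()]
    if missing:
        raise ValueError(f"rows missing a 'question' value: {missing}")
    args: dict[str, Any] = {"data": rows}
    if context is not None:
        args["context"] = context  # batch-level context shared by all questions
    return args
```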

### futuresearch_classify
Classify each row into one of the provided categories.
```
Parameters:
- task: (required) Natural language classification instructions
- categories: (required) Allowed categories (minimum 2)
- artifact_id: Artifact ID (UUID) from upload_data or request_upload_url
- data: Inline data as a list of row objects
- classification_field: (optional) Output column name (default: "classification")
- include_reasoning: (optional) Include reasoning column (default: false)
- session_id / session_name: (optional)
```
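
A small sketch of assembling `futuresearch_classify` arguments with the two-category minimum enforced up front. `classify_args` is a hypothetical helper; counting duplicate category names only once is an assumption, not documented behavior.

```python
from typing import Any


def classify_args(task: str, categories: list[str], data: list[dict[str, Any]],
                  classification_field: str = "classification",
                  include_reasoning: bool = False) -> dict[str, Any]:
    """Build futuresearch_classify arguments for inline data."""
    # Assumption: duplicates do not count toward the minimum of two categories.
    if len(set(categories)) < 2:
        raise ValueError("categories must contain at least 2 distinct values")
    return {
        "task": task,
        "categories": categories,
        "data": data,
        "classification_field": classification_field,
        "include_reasoning": include_reasoning,
    }
```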

## Data Management

### futuresearch_browse_lists
Browse available reference lists of well-known entities (S&P 500, FTSE 100, countries, universities, etc.).
```
Parameters:
- search: (optional) Search term to match list names
- category: (optional) Filter by category (e.g. "Finance", "Geography")
```

### futuresearch_use_list
Import a reference list into your session and save it as a CSV.
```
Parameters:
- artifact_id: (required) artifact_id from futuresearch_browse_lists results
```

### futuresearch_upload_data
Upload data from a URL or local file. Returns an artifact_id for use in processing tools.
```
Parameters:
- source: (required) HTTP(S) URL (Google Sheets supported) or local CSV path (stdio mode only)
- session_id / session_name: (optional)
```

### futuresearch_request_upload_url
Request a presigned URL to upload a local CSV file (HTTP mode only).
```
Parameters:
- filename: (required) Name of the file to upload (must end in .csv)
```
Steps: call this tool → execute the returned curl command → use the artifact_id from the response.

## Task Lifecycle

### futuresearch_progress
Check progress of a running task. Blocks briefly to limit polling rate.
```
Parameters:
- task_id: (required) Task ID returned by the operation tool
```
After receiving a status update, immediately call futuresearch_progress again unless the task is completed or failed.
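
The polling rule above can be sketched as a loop. Here `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and reading the state from a `"status"` key in the response is an assumption about the response shape. No sleep is added because the tool itself blocks briefly between updates.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(call_tool: ToolCaller, task_id: str, max_polls: int = 1000) -> dict[str, Any]:
    """Call futuresearch_progress repeatedly until the task completes or fails."""
    for _ in range(max_polls):
        update = call_tool("futuresearch_progress", {"task_id": task_id})
        # Assumed response shape: a dict with a "status" key.
        if update.get("status") in TERMINAL_STATUSES:
            return update
    raise TimeoutError(f"task {task_id} still running after {max_polls} polls")
```

The `max_polls` cap is a safety net for the sketch rather than a documented limit.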

### futuresearch_results
Retrieve results from a completed task.
```
Parameters:
- task_id: (required) Task ID of the completed task
- output_path: (stdio) Full path to output CSV (must end in .csv)
- offset: (http, optional) Row offset for pagination (default: 0)
- page_size: (http, optional) Number of rows to load into context (default: auto threshold based on row count)
```
Only call after futuresearch_progress reports status "completed".

### futuresearch_cancel
Cancel a running task.
```
Parameters:
- task_id: (required) Task ID to cancel
```

## Sessions & Account

### futuresearch_list_sessions
List sessions owned by the authenticated user (paginated).
```
Parameters:
- offset: (optional) Number of sessions to skip (default: 0)
- limit: (optional) Max sessions per page (default: 25, max: 1000)
```
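
Paging through all sessions with `offset` and `limit` could look like the sketch below. `call_tool` is a hypothetical callable for the MCP client, and the `"sessions"` key in the response is an assumed shape; only the parameter names and the 1000-row maximum come from the description above.

```python
from typing import Any, Callable, Iterator

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def iter_sessions(call_tool: ToolCaller, limit: int = 25) -> Iterator[dict[str, Any]]:
    """Yield sessions page by page until a short page signals the end."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    offset = 0
    while True:
        page = call_tool("futuresearch_list_sessions", {"offset": offset, "limit": limit})
        sessions = page.get("sessions", [])  # assumed response key
        yield from sessions
        if len(sessions) < limit:
            return
        offset += len(sessions)
```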

### futuresearch_list_session_tasks
List all tasks in a session with their IDs, statuses, and types.
```
Parameters:
- session_id: (required) Session ID (UUID) to list tasks for
```

### futuresearch_balance
Check the current billing balance for the auth
