Claude
Skills
Sign in
Back

setup-timescaledb-hypertables

Included with Lifetime
$97 forever

Use this skill when creating database schemas or tables for Timescale, TimescaleDB, TigerData, or Tiger Cloud, especially for time-series, IoT, metrics, events, or log data. Use this to improve the performance of any insert-heavy table. **Trigger when user asks to:** - Create or design SQL schemas/tables AND Timescale/TimescaleDB/TigerData/Tiger Cloud is available - Set up hypertables, compression, retention policies, or continuous aggregates - Configure partition columns, segment_by, order_by, or chunk intervals - Optimize time-series database performance or storage - Create tables for sensors, metrics, telemetry, events, or transaction logs **Keywords:** CREATE TABLE, hypertable, Timescale, TimescaleDB, time-series, IoT, metrics, sensor data, compression policy, continuous aggregates, columnstore, retention policy, chunk interval, segment_by, order_by Step-by-step instructions for hypertable creation, column selection, compression policies, retention, continuous aggregates, and indexes.

Design

What this skill does


# TimescaleDB Complete Setup

Instructions for insert-heavy data patterns where data is inserted but rarely changed:

- **Time-series data** (sensors, metrics, system monitoring)
- **Event logs** (user events, audit trails, application logs)
- **Transaction records** (orders, payments, financial transactions)
- **Sequential data** (records with auto-incrementing IDs and timestamps)
- **Append-only datasets** (immutable records, historical data)

## Step 1: Create Hypertable

```sql
CREATE TABLE your_table_name (
    timestamp TIMESTAMPTZ NOT NULL,
    entity_id TEXT NOT NULL,          -- device_id, user_id, symbol, etc.
    category TEXT,                    -- sensor_type, event_type, asset_class, etc.
    value_1 DOUBLE PRECISION,         -- price, temperature, latency, etc.
    value_2 DOUBLE PRECISION,         -- volume, humidity, throughput, etc.
    value_3 INTEGER,                  -- count, status, level, etc.
    metadata JSONB                    -- flexible additional data
) WITH (
    tsdb.hypertable,
    tsdb.partition_column='timestamp',
    tsdb.enable_columnstore=true,     -- Disable if table has vector columns
    tsdb.segmentby='entity_id',       -- See selection guide below
    tsdb.orderby='timestamp DESC',     -- See selection guide below
    tsdb.sparse_index='minmax(value_1),minmax(value_2),minmax(value_3)' -- see selection guide below
);
```

### Compression Decision

- **Enable by default** for insert-heavy patterns
- **Disable** if table has vector type columns (pgvector) - indexes on vector columns incompatible with columnstore

### Partition Column Selection

Must be time-based (TIMESTAMP/TIMESTAMPTZ/DATE) or integer (INT/BIGINT) with good temporal/sequential distribution.

**Common patterns:**

- TIME-SERIES: `timestamp`, `event_time`, `measured_at`
- EVENT LOGS: `event_time`, `created_at`, `logged_at`
- TRANSACTIONS: `created_at`, `transaction_time`, `processed_at`
- SEQUENTIAL: `id` (auto-increment when no timestamp), `sequence_number`
- APPEND-ONLY: `created_at`, `inserted_at`, `id`

**Less ideal:** `ingested_at` (when data entered system - use only if it's your primary query dimension)
**Avoid:** `updated_at` (breaks time ordering unless it's primary query dimension)

### Segment_By Column Selection

**PREFER SINGLE COLUMN** - multi-column rarely optimal. Multi-column can only work for highly correlated columns (e.g., metric_name + metric_type) with sufficient row density.

**Requirements:**

- Frequently used in WHERE clauses (most common filter)
- Good row density (>100 rows per value per chunk)
- Primary logical partition/grouping

**Examples:**

- IoT: `device_id`
- Finance: `symbol`
- Metrics: `service_name`, `service_name, metric_type` (if sufficient row density), `metric_name, metric_type` (if sufficient row density)
- Analytics: `user_id` if sufficient row density, otherwise `session_id`
- E-commerce: `product_id` if sufficient row density, otherwise `category_id`

**Row density guidelines:**

- Target: >100 rows per segment_by value within each chunk.
- Poor: <10 rows per segment_by value per chunk → choose less granular column
- What to do with low-density columns: prepend to order_by column list.

**Query pattern drives choice:**

```sql
SELECT * FROM table WHERE entity_id = 'X' AND timestamp > ...
-- ↳ segment_by: entity_id (if >100 rows per chunk)
```

**Avoid:** timestamps, unique IDs, low-density columns (<100 rows/value/chunk), columns rarely used in filtering

### Order_By Column Selection

Creates natural time-series progression when combined with segment_by for optimal compression.

**Most common:** `timestamp DESC`

**Examples:**

- IoT/Finance/E-commerce: `timestamp DESC`
- Metrics: `metric_name, timestamp DESC` (if metric_name has too low density for segment_by)
- Analytics: `user_id, timestamp DESC` (user_id has too low density for segment_by)

**Alternative patterns:**

- `sequence_id DESC` for event streams with sequence numbers
- `timestamp DESC, event_order DESC` for sub-ordering within same timestamp

**Low-density column handling:**
If a column has <100 rows per chunk (too low for segment_by), prepend it to order_by:

- Example: `metric_name` has 20 rows/chunk → use `segment_by='service_name'`, `order_by='metric_name, timestamp DESC'`
- Groups similar values together (all temperature readings, then pressure readings) for better compression

**Good test:** ordering created by `(segment_by_column, order_by_column)` should form a natural time-series progression. Values close to each other in the progression should be similar.

**Avoid in order_by:** random columns, columns with high variance between adjacent rows, columns unrelated to segment_by

### Compression Sparse Index Selection

**Sparse indexes** enable query filtering on compressed data without decompression. Store metadata per batch (~1000 rows) to eliminate batches that don't match query predicates.

**Types:**

- **minmax:** Min/max values per batch - for range queries (>, <, BETWEEN) on numeric/temporal columns

**Use minmax for:** price, temperature, measurement, timestamp (range filtering)

**Use for:**

- minmax for outlier detection (temperature > 90).
- minmax for fields that are highly correlated with segmentby and orderby columns (e.g. if orderby includes `created_at`, minmax on `updated_at` is useful).

**Avoid:** rarely filtered columns.

IMPORTANT: NEVER index columns in segmentby or orderby. Orderby columns will always have minmax indexes without any configuration.

**Configuration:**
The format is a comma-separated list of type_of_index(column_name).

```sql
ALTER TABLE table_name SET (
    timescaledb.sparse_index = 'minmax(value_1),minmax(value_2)'
);
```

Explicit configuration available since v2.22.0 (was auto-created since v2.16.0).

### Chunk Time Interval (Optional)

Default: 7 days (use if volume unknown, or ask user). Adjust based on volume:

- High frequency: 1 hour - 1 day
- Medium: 1 day - 1 week
- Low: 1 week - 1 month

```sql
SELECT set_chunk_time_interval('your_table_name', INTERVAL '1 day');
```

**Good test:** recent chunk indexes should fit in less than 25% of RAM.

### Indexes & Primary Keys

Common index patterns - composite indexes on an id and timestamp:

```sql
CREATE INDEX idx_entity_timestamp ON your_table_name (entity_id, timestamp DESC);
```

**Important:** Only create indexes you'll actually use - each has maintenance overhead.

**Primary key and unique constraints rules:** Must include partition column.

**Option 1: Composite PK with partition column**

```sql
ALTER TABLE your_table_name ADD PRIMARY KEY (entity_id, timestamp);
```

**Option 2: Single-column PK (only if it's the partition column)**

```sql
CREATE TABLE ... (id BIGINT PRIMARY KEY, ...) WITH (tsdb.partition_column='id');
```

**Option 3: No PK**: strict uniqueness is often not required for insert-heavy patterns.

## Step 2: Compression Policy (Optional)

**IMPORTANT**: If you used `tsdb.enable_columnstore=true` in Step 1, starting with TimescaleDB version 2.23 a columnstore policy is **automatically created** with `after => INTERVAL '7 days'`. You only need to call `add_columnstore_policy()` if you want to customize the `after` interval to something other than 7 days.

Set `after` interval for when: data becomes mostly immutable (some updates/backfill OK) AND B-tree indexes aren't needed for queries (less common criterion).

```sql
-- In TimescaleDB 2.23 and later only needed if you want to override the default 7-day policy created by tsdb.enable_columnstore=true
-- Remove the existing auto-created policy first:
-- CALL remove_columnstore_policy('your_table_name');
-- Then add custom policy:
-- CALL add_columnstore_policy('your_table_name', after => INTERVAL '1 day');
```

## Step 3: Retention Policy

IMPORTANT: Don't guess - ask user or comment out if unknown.

```sql
-- Example - replace with requirements or comment out
SELECT add_retention_policy('your_table_name', INTERVAL '365 days');
```

## Step 4: Create Continuous A

Related in Design