
flowio


Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, convert to CSV/DataFrame, for flow cytometry data preprocessing.



# FlowIO: Flow Cytometry Standard File Handler

## Overview

FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. It parses FCS metadata, extracts event data, and creates new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, which makes it well suited to backend services, data pipelines, and basic cytometry file operations.

## When to Use This Skill

Use this skill when you need to:

- Parse FCS files or extract their metadata
- Convert flow cytometry event data to NumPy arrays
- Export event data to FCS format
- Separate multi-dataset FCS files
- Extract channel information (scatter, fluorescence, time)
- Validate or inspect cytometry files
- Pre-process data before advanced analysis

**Related Tools:** For advanced flow cytometry analysis, including compensation, gating, and FlowJo/GatingML support, use the FlowKit library as a companion to FlowIO.

## Installation

```bash
uv pip install flowio
```

Requires Python 3.9 or later.

## Quick Start

### Basic File Reading

```python
from flowio import FlowData

# Read FCS file
flow_data = FlowData('experiment.fcs')

# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")

# Get event data as NumPy array
events = flow_data.as_array()  # Shape: (events, channels)
```
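
The event matrix maps directly onto tabular formats. As a minimal sketch using only the standard library, the helper below writes a header row of channel labels followed by one row per event. `events_to_csv` is a hypothetical name, and the literal lists stand in for `flow_data.as_array()` and `flow_data.pnn_labels`.

```python
import csv
import io


def events_to_csv(events, channel_labels):
    """Return CSV text with one column per channel and one row per event."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(channel_labels)
    for row in events:
        if len(row) != len(channel_labels):
            raise ValueError("each event needs one value per channel")
        writer.writerow(row)
    return buffer.getvalue()


# Stand-in data: 2 events, 3 channels
csv_text = events_to_csv([[100, 200, 50], [150, 180, 60]],
                         ["FSC-A", "SSC-A", "FL1-A"])
print(csv_text)
```

If pandas is available, `pd.DataFrame(events, columns=labels)` produces the equivalent DataFrame.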

### Creating FCS Files

```python
import numpy as np
from flowio import create_fcs

# Prepare data
data = np.array([[100, 200, 50], [150, 180, 60]])  # 2 events, 3 channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']

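# Create FCS file. create_fcs expects an open binary file handle and a flat,
# event-ordered sequence (verify against your installed FlowIO version)
with open('output.fcs', 'wb') as fh:
    create_fcs(fh, data.flatten().tolist(), channels)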
```

## Core Workflows

### Reading and Parsing FCS Files

The FlowData class provides the primary interface for reading FCS files.

**Standard Reading:**

```python
from flowio import FlowData

# Basic reading
flow = FlowData('sample.fcs')

# Access attributes
version = flow.version              # '3.0', '3.1', etc.
event_count = flow.event_count      # Number of events
channel_count = flow.channel_count  # Number of channels
pnn_labels = flow.pnn_labels        # Short channel names
pns_labels = flow.pns_labels        # Descriptive stain names

# Get event data
events = flow.as_array()            # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False)  # Raw data
```
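
For context on what preprocessing involves, the FCS standard defines how stored channel values map to scale values through `$PnE` (log decades and offset), `$PnR` (range), and `$PnG` (gain). The sketch below applies those formulas to a single value. `scale_value` is a hypothetical helper written for illustration, not FlowIO's implementation, which may differ in detail.

```python
def scale_value(raw, pnr, pne=(0.0, 0.0), png=1.0):
    """Convert one stored channel value to a scale value.

    pne is the $PnE pair (decades, offset); (0, 0) means linear storage.
    """
    decades, offset = pne
    if decades > 0:
        # Log-stored channel: 10 ** (decades * raw / range) * offset.
        # Treating an offset of 0 as 1 is a common convention and an
        # assumption of this sketch.
        if offset == 0:
            offset = 1.0
        return 10 ** (decades * raw / pnr) * offset
    # Linear channel: divide out the amplifier gain
    return raw / png


print(scale_value(512, pnr=1024, pne=(4.0, 1.0)))  # 100.0
print(scale_value(500, pnr=1024, png=2.0))         # 250.0
```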

**Memory-Efficient Metadata Reading:**

When only metadata is needed (no event data):

```python
# Only parse TEXT segment, skip DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)

# Access metadata (FlowIO stores keywords lowercase, without the '$' prefix)
metadata = flow.text  # Dictionary of TEXT segment keywords
print(metadata.get('date'))  # Acquisition date ($DATE)
print(metadata.get('cyt'))   # Instrument name ($CYT)
```
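
To illustrate what the TEXT segment holds, here is a minimal sketch of its layout: the first character is the delimiter, followed by alternating keywords and values. `parse_text_segment` is a hypothetical helper, and it deliberately ignores escaped (doubled) delimiters and empty values, which a real parser must handle.

```python
def parse_text_segment(text):
    """Split a TEXT segment into a keyword -> value dictionary."""
    delimiter = text[0]  # the first character defines the delimiter
    fields = text[1:].rstrip(delimiter).split(delimiter)
    if len(fields) % 2 != 0:
        raise ValueError("TEXT segment must contain keyword/value pairs")
    # Keywords are case-insensitive, so normalize them to upper case
    return {fields[i].upper(): fields[i + 1] for i in range(0, len(fields), 2)}


sample = "/$TOT/2/$PAR/3/$P1N/FSC-A/$CYT/Synthetic Instrument/"
keywords = parse_text_segment(sample)
print(keywords["$TOT"], keywords["$P1N"])  # 2 FSC-A
```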

**Handling Problematic Files:**

Some FCS files have offset discrepancies or errors:

```python
# Ignore offset discrepancies between HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)

# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)

# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
```
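
These discrepancies arise because segment locations are recorded twice: once in the fixed-width HEADER and again in TEXT keywords such as `$BEGINDATA` and `$ENDDATA`. The following sketch parses the HEADER layout defined by the FCS standard (a 6-character version, 4 spaces, then six 8-character ASCII offsets) and compares the DATA offsets against TEXT values. Both function names are hypothetical and this is not FlowIO's code.

```python
def parse_header(header):
    """Parse the 58-byte FCS HEADER into a version string and offsets."""
    if len(header) < 58:
        raise ValueError("HEADER needs at least 58 bytes")
    version = header[0:6].decode("ascii")  # e.g. 'FCS3.1'
    names = ("text_start", "text_end", "data_start",
             "data_end", "analysis_start", "analysis_end")
    offsets = {}
    for i, name in enumerate(names):
        field = header[10 + 8 * i: 18 + 8 * i].decode("ascii").strip()
        offsets[name] = int(field) if field else 0  # blank treated as 0 here
    return version, offsets


def data_offsets_disagree(offsets, text_keywords):
    """True when HEADER and TEXT give different DATA segment locations."""
    begin = int(text_keywords.get("$BEGINDATA", offsets["data_start"]))
    end = int(text_keywords.get("$ENDDATA", offsets["data_end"]))
    return (begin, end) != (offsets["data_start"], offsets["data_end"])


header = b"FCS3.1    " + b"".join(
    str(n).rjust(8).encode("ascii") for n in (58, 1023, 1024, 9023, 0, 0))
version, offsets = parse_header(header)
print(version, offsets["data_start"])  # FCS3.1 1024
print(data_offsets_disagree(
    offsets, {"$BEGINDATA": "1024", "$ENDDATA": "9024"}))  # True
```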

**Excluding Null Channels:**

```python
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
```

### Extracting Metadata and Channel Information

FCS files contain rich metadata in the TEXT segment.

**Common Metadata Keywords:**

```python
flow = FlowData('sample.fcs')

# File-level metadata
text_dict = flow.text
# Keys are stored lowercase without the '$' prefix ('date' for $DATE)
acquisition_date = text_dict.get('date', 'Unknown')
instrument = text_dict.get('cyt', 'Unknown')
data_type = flow.data_type  # 'I', 'F', 'D', 'A'

# Channel metadata
for i in range(flow.channel_count):
    pnn = flow.pnn_labels[i]      # Short name (e.g., 'FSC-A')
    pns = flow.pns_labels[i]      # Descriptive name (e.g., 'Forward Scatter')
    pnr = flow.pnr_values[i]      # Range/max value
    print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
```

**Channel Type Identification:**

FlowIO automatically categorizes channels:

```python
# Get indices by channel type
scatter_idx = flow.scatter_indices    # [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices      # [2, 3, 4] for FL channels
time_idx = flow.time_index            # Index of time channel (or None)

# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
```
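
As a rough illustration of how channels can be grouped by type, the sketch below classifies labels by naming convention alone. `classify_channels` is a hypothetical helper, and its rules (exact match on "time", `FSC`/`SSC` prefixes for scatter, everything else fluorescence) are assumptions; FlowIO's own categorization may use different criteria.

```python
def classify_channels(pnn_labels):
    """Group channel indices by type using label-name conventions."""
    groups = {"scatter": [], "fluoro": [], "time": None}
    for index, label in enumerate(pnn_labels):
        name = label.strip().lower()
        if name == "time":
            groups["time"] = index
        elif name.startswith(("fsc", "ssc")):
            groups["scatter"].append(index)
        else:
            groups["fluoro"].append(index)
    return groups


print(classify_channels(["FSC-A", "SSC-A", "FL1-A", "FL2-A", "Time"]))
# {'scatter': [0, 1], 'fluoro': [2, 3], 'time': 4}
```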

**ANALYSIS Segment:**

If present, access processed results:

```python
if flow.analysis:
    analysis_keywords = flow.analysis  # Dictionary of ANALYSIS keywords
    print(analysis_keywords)
```

### Creating New FCS Files

Generate FCS files from NumPy arrays or other data sources.

**Basic Creation:**

```python
import numpy as np
from flowio import create_fcs

# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000

# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']

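# Create FCS file. create_fcs expects an open binary file handle and a flat,
# event-ordered sequence (verify against your installed FlowIO version)
with open('output.fcs', 'wb') as fh:
    create_fcs(fh, events.flatten().tolist(), channel_names)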
```

**With Descriptive Channel Names:**

```python
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']

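# create_fcs expects a binary file handle and flattened event data
with open('output.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten().tolist(),
               channel_names,
               opt_channel_names=descriptive_names)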
```

**With Custom Metadata:**

```python
# Add TEXT segment metadata
metadata = {
    '$SRC': 'Python script',
    '$DATE': '19-OCT-2025',
    '$CYT': 'Synthetic Instrument',
    '$INST': 'Laboratory A'
}

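# create_fcs expects a binary file handle and flattened event data;
# the metadata keyword argument is named metadata_dict
with open('output.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten().tolist(),
               channel_names,
               opt_channel_names=descriptive_names,
               metadata_dict=metadata)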
```

**Note:** FlowIO exports as FCS 3.1 with single-precision floating-point data.
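
Single-precision storage matters when values exceed what a 4-byte float can represent exactly. The sketch below uses the standard `struct` module to round-trip values through IEEE 754 single precision; `to_float32` is a hypothetical helper for illustration and is unrelated to FlowIO's internals.

```python
import struct


def to_float32(value):
    """Round-trip a Python float through 4-byte IEEE 754 storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


print(to_float32(1024.5))      # 1024.5 (exactly representable)
print(to_float32(16777217.0))  # 16777216.0 (2**24 + 1 is not representable)
print(to_float32(0.1) == 0.1)  # False
```

Integer-valued channels larger than 2**24, such as high-resolution time stamps, can therefore lose precision on export.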

### Exporting Modified Data

Modify existing FCS files and re-export them.

**Approach 1: Using write_fcs() Method:**

```python
from flowio import FlowData

# Read original file
flow = FlowData('original.fcs')

# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
```

**Approach 2: Extract, Modify, and Recreate:**

For modifying event data:

```python
from flowio import FlowData, create_fcs

# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)

# Modify event data
events[:, 0] = events[:, 0] * 1.5  # Scale first channel

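# Create new FCS file with modified data. create_fcs expects a binary file
# handle, flattened event data, and metadata passed as metadata_dict
with open('modified.fcs', 'wb') as fh:
    create_fcs(fh,
               events.flatten().tolist(),
               flow.pnn_labels,
               opt_channel_names=flow.pns_labels,
               metadata_dict=flow.text)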
```

### Handling Multi-Dataset FCS Files

Some FCS files contain multiple datasets in a single file.

**Detecting Multi-Dataset Files:**

```python
from flowio import FlowData, MultipleDataSetsError

try:
    flow = FlowData('sample.fcs')
except MultipleDataSetsError:
    print("File contains multiple datasets")
    # Use read_multiple_data_sets() instead
```

**Reading All Datasets:**

```python
from flowio import read_multiple_data_sets

# Read all datasets from file
datasets = read_multiple_data_sets('multi_dataset.fcs')

print(f"Found {len(datasets)} datasets")

# Process each dataset
for i, dataset in enumerate(datasets):
    print(f"\nDataset {i}:")
    print(f"  Events: {dataset.event_count}")
    print(f"  Channels: {dataset.pnn_labels}")

    # Get event data for this dataset
    events = dataset.as_array()
    print(f"  Shape: {events.shape}")
    print(f"  Mean values: {events.mean(axis=0)}")
```

**Reading Specific Dataset:**

```python
from flowio import FlowData

# Read first dataset (nextdata_offset=0)
first_dataset = FlowData('multi.fcs', nextdata_offset=0)

# Read second dataset using NEXTDATA offset from first
next_offset = int(first_dataset.text['nextdata'])  # keys are lowercase, no '$'
if next_offset > 0:
    second_dataset = FlowData('multi.fcs
