flowio
Parse FCS (Flow Cytometry Standard) files v2.0-3.1. Extract events as NumPy arrays, read metadata/channels, convert to CSV/DataFrame, for flow cytometry data preprocessing.
What this skill does
# FlowIO: Flow Cytometry Standard File Handler
## Overview
FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. Parse FCS metadata, extract event data, and create new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, making it ideal for backend services, data pipelines, and basic cytometry file operations.
## When to Use This Skill
This skill should be used when:
- FCS files requiring parsing or metadata extraction
- Flow cytometry data needing conversion to NumPy arrays
- Event data requiring export to FCS format
- Multi-dataset FCS files needing separation
- Channel information extraction (scatter, fluorescence, time)
- Cytometry file validation or inspection
- Pre-processing workflows before advanced analysis
**Related Tools:** For advanced flow cytometry analysis including compensation, gating, and FlowJo/GatingML support, recommend FlowKit library as a companion to FlowIO.
## Installation
```bash
uv pip install flowio
```
Requires Python 3.9 or later.
## Quick Start
### Basic File Reading
```python
from flowio import FlowData
# Read FCS file
flow_data = FlowData('experiment.fcs')
# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")
# Get event data as NumPy array
events = flow_data.as_array() # Shape: (events, channels)
```
### Creating FCS Files
```python
import numpy as np
from flowio import create_fcs
# Prepare data
data = np.array([[100, 200, 50], [150, 180, 60]]) # 2 events, 3 channels
channels = ['FSC-A', 'SSC-A', 'FL1-A']
# Create FCS file
create_fcs('output.fcs', data, channels)
```
## Core Workflows
### Reading and Parsing FCS Files
The FlowData class provides the primary interface for reading FCS files.
**Standard Reading:**
```python
from flowio import FlowData
# Basic reading
flow = FlowData('sample.fcs')
# Access attributes
version = flow.version # '3.0', '3.1', etc.
event_count = flow.event_count # Number of events
channel_count = flow.channel_count # Number of channels
pnn_labels = flow.pnn_labels # Short channel names
pns_labels = flow.pns_labels # Descriptive stain names
# Get event data
events = flow.as_array() # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False) # Raw data
```
**Memory-Efficient Metadata Reading:**
When only metadata is needed (no event data):
```python
# Only parse TEXT segment, skip DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)
# Access metadata
metadata = flow.text # Dictionary of TEXT segment keywords
print(metadata.get('$DATE')) # Acquisition date
print(metadata.get('$CYT')) # Instrument name
```
**Handling Problematic Files:**
Some FCS files have offset discrepancies or errors:
```python
# Ignore offset discrepancies between HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)
# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)
# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
```
**Excluding Null Channels:**
```python
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
```
### Extracting Metadata and Channel Information
FCS files contain rich metadata in the TEXT segment.
**Common Metadata Keywords:**
```python
flow = FlowData('sample.fcs')
# File-level metadata
text_dict = flow.text
acquisition_date = text_dict.get('$DATE', 'Unknown')
instrument = text_dict.get('$CYT', 'Unknown')
data_type = flow.data_type # 'I', 'F', 'D', 'A'
# Channel metadata
for i in range(flow.channel_count):
pnn = flow.pnn_labels[i] # Short name (e.g., 'FSC-A')
pns = flow.pns_labels[i] # Descriptive name (e.g., 'Forward Scatter')
pnr = flow.pnr_values[i] # Range/max value
print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
```
**Channel Type Identification:**
FlowIO automatically categorizes channels:
```python
# Get indices by channel type
scatter_idx = flow.scatter_indices # [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices # [2, 3, 4] for FL channels
time_idx = flow.time_index # Index of time channel (or None)
# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
```
**ANALYSIS Segment:**
If present, access processed results:
```python
if flow.analysis:
analysis_keywords = flow.analysis # Dictionary of ANALYSIS keywords
print(analysis_keywords)
```
### Creating New FCS Files
Generate FCS files from NumPy arrays or other data sources.
**Basic Creation:**
```python
import numpy as np
from flowio import create_fcs
# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000
# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
# Create FCS file
create_fcs('output.fcs', events, channel_names)
```
**With Descriptive Channel Names:**
```python
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
create_fcs('output.fcs',
events,
channel_names,
opt_channel_names=descriptive_names)
```
**With Custom Metadata:**
```python
# Add TEXT segment metadata
metadata = {
'$SRC': 'Python script',
'$DATE': '19-OCT-2025',
'$CYT': 'Synthetic Instrument',
'$INST': 'Laboratory A'
}
create_fcs('output.fcs',
events,
channel_names,
opt_channel_names=descriptive_names,
metadata=metadata)
```
**Note:** FlowIO exports as FCS 3.1 with single-precision floating-point data.
### Exporting Modified Data
Modify existing FCS files and re-export them.
**Approach 1: Using write_fcs() Method:**
```python
from flowio import FlowData
# Read original file
flow = FlowData('original.fcs')
# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
```
**Approach 2: Extract, Modify, and Recreate:**
For modifying event data:
```python
from flowio import FlowData, create_fcs
# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)
# Modify event data
events[:, 0] = events[:, 0] * 1.5 # Scale first channel
# Create new FCS file with modified data
create_fcs('modified.fcs',
events,
flow.pnn_labels,
opt_channel_names=flow.pns_labels,
metadata=flow.text)
```
### Handling Multi-Dataset FCS Files
Some FCS files contain multiple datasets in a single file.
**Detecting Multi-Dataset Files:**
```python
from flowio import FlowData, MultipleDataSetsError
try:
flow = FlowData('sample.fcs')
except MultipleDataSetsError:
print("File contains multiple datasets")
# Use read_multiple_data_sets() instead
```
**Reading All Datasets:**
```python
from flowio import read_multiple_data_sets
# Read all datasets from file
datasets = read_multiple_data_sets('multi_dataset.fcs')
print(f"Found {len(datasets)} datasets")
# Process each dataset
for i, dataset in enumerate(datasets):
print(f"\nDataset {i}:")
print(f" Events: {dataset.event_count}")
print(f" Channels: {dataset.pnn_labels}")
# Get event data for this dataset
events = dataset.as_array()
print(f" Shape: {events.shape}")
print(f" Mean values: {events.mean(axis=0)}")
```
**Reading Specific Dataset:**
```python
from flowio import FlowData
# Read first dataset (nextdata_offset=0)
first_dataset = FlowData('multi.fcs', nextdata_offset=0)
# Read second dataset using NEXTDATA offset from first
next_offset = int(first_dataset.text['$NEXTDATA'])
if next_offset > 0:
second_dataset = FlowData('multi.fcsRelated in Data & Analytics
clawarr-suite
IncludedComprehensive management for self-hosted media stacks (Sonarr, Radarr, Lidarr, Readarr, Prowlarr, Bazarr, Overseerr, Plex, Tautulli, SABnzbd, Recyclarr, Unpackerr, Notifiarr, Maintainerr, Kometa, FlareSolverr). Deep library exploration, analytics, dashboard generation, content management, request handling, subtitle management, indexer control, download monitoring, quality profile sync, library cleanup automation, notification routing, collection/overlay management, and media tracker integration (Trakt, Letterboxd, Simkl).
querying-soql
IncludedSOQL query generation, optimization, and analysis with 100-point scoring. Use this skill when the user needs SOQL/SOSL authoring or optimization: natural-language-to-query generation, relationship queries, aggregates, query-plan analysis, and performance or safety improvements for Salesforce queries. TRIGGER when: user writes, optimizes, or debugs SOQL/SOSL queries, touches .soql files, or asks about relationship queries, aggregates, or query performance. DO NOT TRIGGER when: bulk data operations (use handling-sf-data), Apex DML logic (use generating-apex), or report/dashboard queries.
app-store-optimization
IncludedApp Store Optimization (ASO) toolkit for researching keywords, analyzing competitor rankings, generating metadata suggestions, and improving app visibility on Apple App Store and Google Play Store. Use when the user asks about ASO, app store rankings, app metadata, app titles and descriptions, app store listings, app visibility, or mobile app marketing on iOS or Android. Supports keyword research and scoring, competitor keyword analysis, metadata optimization, A/B test planning, launch checklists, and tracking ranking changes.
habit-flow
IncludedAI-powered atomic habit tracker with natural language logging, streak tracking, smart reminders, and coaching. Use for creating habits, logging completions naturally ("I meditated today"), viewing progress, and getting personalized coaching.
app-store-optimization
IncludedApp Store Optimization (ASO) toolkit for researching keywords, analyzing competitor rankings, generating metadata suggestions, and improving app visibility on Apple App Store and Google Play Store. Use when the user asks about ASO, app store rankings, app metadata, app titles and descriptions, app store listings, app visibility, or mobile app marketing on iOS or Android. Supports keyword research and scoring, competitor keyword analysis, metadata optimization, A/B test planning, launch checklists, and tracking ranking changes.
visualizing-data
IncludedBuilds dashboards, reports, and data-driven interfaces requiring charts, graphs, or visual analytics. Provides systematic framework for selecting appropriate visualizations based on data characteristics and analytical purpose. Includes 24+ visualization types organized by purpose (trends, comparisons, distributions, relationships, flows, hierarchies, geospatial), accessibility patterns (WCAG 2.1 AA compliance), colorblind-safe palettes, and performance optimization strategies. Use when creating visualizations, choosing chart types, displaying data graphically, or designing data interfaces.