mdanalysis

Comprehensive guide for MDAnalysis - the Python library for analyzing molecular dynamics trajectories. Use for trajectory loading, RMSD/RMSF calculations, distance/angle/dihedral analysis, atom selections, hydrogen bonds, solvent accessible surface area, protein structure analysis, membrane analysis, and integration with Biopython. Essential for MD simulation analysis.


# MDAnalysis - Molecular Dynamics Analysis

Python library for reading, writing, and analyzing molecular dynamics trajectories and structural files.

## When to Use

- Loading MD trajectories (DCD, XTC, TRR, NetCDF, etc.)
- RMSD and RMSF calculations
- Distance, angle, and dihedral analysis
- Atom selections (VMD-like syntax)
- Hydrogen bond analysis
- Solvent Accessible Surface Area (SASA)
- Protein secondary structure analysis
- Membrane system analysis
- Water/ion distribution analysis
- Trajectory alignment and fitting
- Custom trajectory analysis
- Converting between file formats

## Reference Documentation

**Official docs**: https://www.mdanalysis.org/docs/  
**Search patterns**: `MDAnalysis.Universe`, `MDAnalysis.analysis.rms`, `MDAnalysis.analysis.distances`

## Core Principles

### Use MDAnalysis For

| Task | Module | Example |
|------|--------|---------|
| Load trajectory | `Universe` | `Universe(topology, trajectory)` |
| RMSD calculation | `analysis.rms` | `RMSD(mobile, ref)` |
| Atom selection | `select_atoms` | `u.select_atoms('protein')` |
| Distance analysis | `analysis.distances` | `distance_array(pos1, pos2)` |
| H-bond analysis | `analysis.hydrogenbonds` | `HydrogenBondAnalysis(u)` |
| SASA calculation | external (not in core `MDAnalysis.analysis`) | e.g. an MDAKit or FreeSASA |
| Contacts analysis | `analysis.contacts` | `Contacts()` |
| Trajectory writing | `Writer` | `with mda.Writer('out.xtc', n_atoms) as W` |

### Do NOT Use For

- Running MD simulations (use GROMACS, AMBER, NAMD)
- Force field calculations (use OpenMM)
- Quantum chemistry (use PySCF, Psi4)
- Protein structure prediction (use AlphaFold, RoseTTAFold)
- Initial structure building (use Biopython, PyMOL)

## Quick Reference

### Installation

```bash
# pip
pip install MDAnalysis

# With optional analysis dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "MDAnalysis[analysis]"

# conda
conda install -c conda-forge mdanalysis

# Development version (the installable package lives in the repository's package/ subdirectory)
pip install "git+https://github.com/MDAnalysis/mdanalysis.git#subdirectory=package"
```

### Standard Imports

```python
# Core imports
import MDAnalysis as mda
from MDAnalysis import Universe
from MDAnalysis.analysis import rms, align, distances

# Common analysis modules
from MDAnalysis.analysis.rms import RMSD, RMSF
from MDAnalysis.analysis.distances import distance_array
from MDAnalysis.analysis.hydrogenbonds.hbond_analysis import HydrogenBondAnalysis
from MDAnalysis.analysis.dihedrals import Dihedral

# Utilities
import numpy as np
import matplotlib.pyplot as plt
```

### Basic Pattern - Load and Analyze

```python
import MDAnalysis as mda
from MDAnalysis.analysis.rms import RMSD

# Load trajectory
u = mda.Universe('topology.pdb', 'trajectory.dcd')

# Select atoms
protein = u.select_atoms('protein')
ca_atoms = u.select_atoms('protein and name CA')

# Calculate RMSD
rmsd_analysis = RMSD(protein, protein, select='backbone')
rmsd_analysis.run()

# Access results
rmsd = rmsd_analysis.results.rmsd
print(f"RMSD over time: {rmsd[:, 2]}")  # Column 2 is RMSD
```
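
The value in column 2 is a root-mean-square deviation computed after the mobile coordinates are fitted to the reference. The plain-Python sketch below only illustrates the quantity and why fitting matters. It removes translation by centering on the centroid and omits the rotational fit that `RMSD` also performs. The helpers `centroid`, `centered`, and `rmsd` are illustrative names, not MDAnalysis functions.

```python
import math

def centroid(coords):
    """Unweighted geometric center of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))

def centered(coords):
    """Translate coordinates so their centroid sits at the origin."""
    cx, cy, cz = centroid(coords)
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]

def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equally sized")
    squared = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(squared / len(a))

reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
# Same shape, rigidly shifted by 3 Angstrom along x
mobile = [(x + 3.0, y, z) for x, y, z in reference]

rmsd_raw = rmsd(mobile, reference)                            # dominated by the shift
rmsd_centered = rmsd(centered(mobile), centered(reference))   # shift removed
print(f"raw: {rmsd_raw:.3f} A, centered: {rmsd_centered:.3f} A")
```

The two structures are identical in shape, so any nonzero value in the raw result comes purely from rigid-body displacement.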

### Basic Pattern - Atom Selection

```python
import MDAnalysis as mda

u = mda.Universe('structure.pdb')

# Various selections (VMD-like syntax)
protein = u.select_atoms('protein')
backbone = u.select_atoms('backbone')
ca = u.select_atoms('name CA')
resid_10 = u.select_atoms('resid 10')
within_5A = u.select_atoms('around 5 resid 10')
water = u.select_atoms('resname WAT or resname HOH')

print(f"Number of protein atoms: {len(protein)}")
print(f"Number of CA atoms: {len(ca)}")
```
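
The `around 5 resid 10` selection returns atoms within 5 Å of residue 10 but not the atoms of residue 10 itself. The brute-force sketch below mimics that behaviour on plain coordinate tuples. The function `around` here is a hypothetical stand-in, it ignores periodic boundaries, and treating the cutoff as inclusive is an assumption.

```python
import math

def around(coords, reference, cutoff):
    """Indices of atoms within `cutoff` of any reference atom, excluding the reference atoms."""
    ref = set(reference)
    hits = []
    for i, pos in enumerate(coords):
        if i in ref:
            continue  # the reference group itself is never returned
        if any(math.dist(pos, coords[j]) <= cutoff for j in ref):
            hits.append(i)
    return hits

coords = [
    (0.0, 0.0, 0.0),  # reference atom
    (1.0, 0.0, 0.0),  # reference atom
    (4.0, 0.0, 0.0),  # 3 A from the nearest reference atom
    (9.0, 0.0, 0.0),  # 8 A from the nearest reference atom
]
reference = [0, 1]
nearby = around(coords, reference, cutoff=5.0)
print(nearby)
```

A real selection on a large system would rely on the library's neighbour search rather than this quadratic loop.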

## Critical Rules

### ✅ DO

- **Close trajectory files** - Use context managers or close explicitly
- **Use atom selections efficiently** - Cache selections for reuse
- **Check trajectory length** - Verify n_frames before analysis
- **Use vectorized operations** - Leverage NumPy for speed
- **Align trajectories** - Align before RMSD calculations
- **Handle periodic boundaries** - Use PBC-aware distance calculations
- **Validate atom groups** - Check empty selections
- **Use appropriate frames** - Slice trajectories if needed
- **Save intermediate results** - Don't recompute expensive calculations
- **Check units** - MDAnalysis uses Angstroms and picoseconds
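
Checking `n_frames` and slicing go together, since `u.trajectory[start:stop:step]` is indexed like a Python sequence. The sketch below uses the standard `slice.indices` method to preview which frame indices a slice would visit for a given trajectory length. `frame_indices` is a hypothetical helper, not part of MDAnalysis.

```python
def frame_indices(n_frames, start=None, stop=None, step=None):
    """Frame indices that a Python-style slice selects from a trajectory of n_frames."""
    if n_frames <= 0:
        raise ValueError("trajectory has no frames")
    return list(range(*slice(start, stop, step).indices(n_frames)))

selected = frame_indices(100, start=10, stop=50, step=10)
last_five = frame_indices(100, start=-5)
print(selected)
print(last_five)
```

Previewing the indices this way makes it easy to catch an empty or unexpectedly short slice before launching a long analysis.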

### ❌ DON'T

- **Load large trajectories fully into memory** - Stream through frames instead
- **Ignore PBC** - Always consider periodic boundary conditions
- **Forget to align** - RMSD without superposition mixes rigid-body motion with conformational change
- **Use wrong atom names** - Check topology for correct names
- **Mix coordinate systems** - Be consistent with units
- **Ignore missing atoms** - Handle incomplete residues
- **Recompute unnecessarily** - Cache expensive calculations
- **Use string selections in loops** - Parse once, reuse
- **Forget to unwrap coordinates** - Handle molecules split by PBC
- **Ignore memory limits** - Process large trajectories in chunks
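
Streaming and chunking both come down to keeping only running statistics rather than every frame. The sketch below accumulates mean positions one frame at a time. `streaming_mean` and `fake_trajectory` are hypothetical names, and the generator is only a stand-in for iterating over `u.trajectory`.

```python
def streaming_mean(frames):
    """Average positions over frames, holding only one frame plus the running mean."""
    mean, count = None, 0
    for frame in frames:
        count += 1
        if mean is None:
            mean = [list(atom) for atom in frame]
            continue
        for avg, atom in zip(mean, frame):
            for k in range(3):
                avg[k] += (atom[k] - avg[k]) / count  # incremental mean update
    if mean is None:
        raise ValueError("no frames supplied")
    return mean, count

def fake_trajectory(n_frames):
    """Hypothetical stand-in for a trajectory: yields one frame of two atoms at a time."""
    for i in range(n_frames):
        yield [(float(i), 0.0, 0.0), (0.0, float(2 * i), 1.0)]

mean_positions, n_seen = streaming_mean(fake_trajectory(5))
print(n_seen, mean_positions)
```

Memory use stays proportional to one frame regardless of trajectory length, which is the property the rules above ask for.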

## Anti-Patterns (NEVER)

```python
import MDAnalysis as mda
import numpy as np

# ❌ BAD: Loading entire trajectory in memory
u = mda.Universe('top.pdb', 'traj.dcd')
all_coords = []
for ts in u.trajectory:
    all_coords.append(u.atoms.positions.copy())
all_coords = np.array(all_coords)  # Huge memory usage!

# ✅ GOOD: Process frame by frame
u = mda.Universe('top.pdb', 'traj.dcd')
for ts in u.trajectory:
    # Process current frame
    coords = u.atoms.positions
    # Do analysis...
    # Move to next frame automatically

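# ❌ BAD: RMSD without alignment
from MDAnalysis.analysis import rms
mobile = u.select_atoms('backbone')
reference = mda.Universe('top.pdb').select_atoms('backbone')  # static reference structure
rmsd_values = []
for ts in u.trajectory:
    rmsd = rms.rmsd(mobile.positions, reference.positions)  # no superposition applied
    rmsd_values.append(rmsd)  # Wrong! Includes rigid-body translation and rotation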

# ✅ GOOD: Align before RMSD
from MDAnalysis.analysis.rms import RMSD
R = RMSD(mobile, reference, select='backbone')
R.run()
rmsd_values = R.results.rmsd[:, 2]

# ❌ BAD: Creating selection in loop
for ts in u.trajectory:
    ca = u.select_atoms('name CA')  # Parsed every frame!
    # Do something with ca

# ✅ GOOD: Create selection once
ca = u.select_atoms('name CA')
for ts in u.trajectory:
    # ca.positions reflects the current frame; the set of selected atoms stays fixed
    positions = ca.positions

# ❌ BAD: Ignoring periodic boundaries
atom1, atom2 = u.atoms[0], u.atoms[1]  # any two atoms of interest
distance = np.linalg.norm(atom1.position - atom2.position)

# ✅ GOOD: PBC-aware distance
from MDAnalysis.lib.distances import distance_array
dist = distance_array(
    atom1.position[np.newaxis, :],
    atom2.position[np.newaxis, :],
    box=u.dimensions
)[0, 0]

# ❌ BAD: Not checking for empty selections
selection = u.select_atoms('resname XYZ')
# Continue without checking if selection is empty!
avg_pos = selection.center_of_mass()  # May crash!

# ✅ GOOD: Validate selections
selection = u.select_atoms('resname XYZ')
if len(selection) == 0:
    print("Warning: No atoms found matching selection")
else:
    avg_pos = selection.center_of_mass()
```
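
What a PBC-aware distance does differently can be seen in a few lines. The sketch below applies the minimum-image convention for an orthorhombic box in plain Python. It assumes all box angles are 90 degrees, and `minimum_image_distance` is a hypothetical helper rather than an MDAnalysis function.

```python
import math

def minimum_image_distance(p, q, box_lengths):
    """Distance between p and q using the nearest periodic image (orthorhombic box only)."""
    total = 0.0
    for a, b, length in zip(p, q, box_lengths):
        d = a - b
        d -= length * round(d / length)  # shift displacement into [-L/2, L/2]
        total += d * d
    return math.sqrt(total)

p = (1.0, 1.0, 1.0)
q = (9.0, 1.0, 1.0)
box = (10.0, 10.0, 10.0)  # box edge lengths in Angstrom

naive = math.dist(p, q)                       # ignores the periodic box
wrapped = minimum_image_distance(p, q, box)   # uses the nearest image of q
print(f"naive: {naive} A, minimum image: {wrapped} A")
```

The two atoms sit near opposite faces of the box, so the naive distance overstates their separation by a factor of four.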

## Loading Trajectories (Universe)

### Basic Universe Creation

```python
import MDAnalysis as mda
import numpy as np

# Single structure file
u = mda.Universe('protein.pdb')

# Topology + trajectory
u = mda.Universe('topology.pdb', 'trajectory.dcd')

# Multiple trajectories (concatenated)
u = mda.Universe('top.pdb', 'traj1.dcd', 'traj2.dcd', 'traj3.dcd')

# Different formats
u = mda.Universe('system.gro', 'traj.xtc')  # GROMACS
u = mda.Universe('system.psf', 'traj.dcd')  # CHARMM/NAMD
u = mda.Universe('system.prmtop', 'traj.nc')  # AMBER

# From memory (numpy arrays)
coords = np.random.rand(100, 3)  # 100 atoms, xyz
u = mda.Universe.empty(100, trajectory=True)
u.atoms.positions = coords

print(f"Number of atoms: {len(u.atoms)}")
print(f"Number of frames: {len(u.trajectory)}")
print(f"Total time: {u.trajectory.totaltime} ps")
```
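
`totaltime` is reported in picoseconds and spans the first to the last frame, that is `(n_frames - 1) * dt`. A small sketch of that arithmetic, using a hypothetical helper `total_time_ps` and made-up frame counts:

```python
PS_PER_NS = 1000.0

def total_time_ps(n_frames, dt_ps):
    """Time spanned between the first and last frame, in picoseconds."""
    if n_frames < 1:
        raise ValueError("need at least one frame")
    return (n_frames - 1) * dt_ps

span_ps = total_time_ps(n_frames=1001, dt_ps=2.0)
span_ns = span_ps / PS_PER_NS
print(f"{span_ps} ps = {span_ns} ns")
```

A single structure file therefore has a total time of zero, which is a quick way to notice that no trajectory was loaded.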

### Trajectory Information

```python
import MDAnalysis as mda

u = mda.Universe('topology.pdb', 'trajectory.dcd')

# Trajectory properties
traj = u.trajectory
print(f"Number of frames: {traj.n_frames}")
print(f"Time step: {traj.dt} ps")
print(f"Total time: {traj.totaltime} ps")

# Current frame info
print(f"Current frame: {traj.frame}")
print(f"Current time: {traj.time} ps")
print(f"Box dimensions: {u.dimensions}")  # [a, b, c, alpha, beta, gamma]

# Iterate through frames
for i, ts in enumerate(u.trajectory):
    if i >= 5:
        break
    print(f"Frame {ts.frame}: time {ts.time} ps")
```

Source: https://github.com/tondevrel/scientific-agent-skills/tree/main/skills/mdanalysis
