biorxiv-database
Efficient database search tool for bioRxiv preprint server. Use this skill when searching for life sciences preprints by keywords, authors, date ranges, or categories, retrieving paper metadata, downloading PDFs, or conducting literature reviews.
What this skill does
# bioRxiv Database
## Overview
This skill provides efficient Python-based tools for searching and retrieving preprints from the bioRxiv database. It enables comprehensive searches by keywords, authors, date ranges, and categories, returning structured JSON metadata that includes titles, abstracts, DOIs, and citation information. The skill also supports PDF downloads for full-text analysis.
## When to Use This Skill
Use this skill when:
- Searching for recent preprints in specific research areas
- Tracking publications by particular authors
- Conducting systematic literature reviews
- Analyzing research trends over time periods
- Retrieving metadata for citation management
- Downloading preprint PDFs for analysis
- Filtering papers by bioRxiv subject categories
## Core Search Capabilities
### 1. Keyword Search
Search for preprints containing specific keywords in titles, abstracts, or author lists.
**Basic Usage:**
```python
python scripts/biorxiv_search.py \
--keywords "CRISPR" "gene editing" \
--start-date 2024-01-01 \
--end-date 2024-12-31 \
--output results.json
```
**With Category Filter:**
```python
python scripts/biorxiv_search.py \
--keywords "neural networks" "deep learning" \
--days-back 180 \
--category neuroscience \
--output recent_neuroscience.json
```
**Search Fields:**
By default, keywords are searched in both title and abstract. Customize with `--search-fields`:
```python
python scripts/biorxiv_search.py \
--keywords "AlphaFold" \
--search-fields title \
--days-back 365
```
### 2. Author Search
Find all papers by a specific author within a date range.
**Basic Usage:**
```python
python scripts/biorxiv_search.py \
--author "Smith" \
--start-date 2023-01-01 \
--end-date 2024-12-31 \
--output smith_papers.json
```
**Recent Publications:**
```python
# Last year by default if no dates specified
python scripts/biorxiv_search.py \
--author "Johnson" \
--output johnson_recent.json
```
### 3. Date Range Search
Retrieve all preprints posted within a specific date range.
**Basic Usage:**
```python
python scripts/biorxiv_search.py \
--start-date 2024-01-01 \
--end-date 2024-01-31 \
--output january_2024.json
```
**With Category Filter:**
```python
python scripts/biorxiv_search.py \
--start-date 2024-06-01 \
--end-date 2024-06-30 \
--category genomics \
--output genomics_june.json
```
**Days Back Shortcut:**
```python
# Last 30 days
python scripts/biorxiv_search.py \
--days-back 30 \
--output last_month.json
```
### 4. Paper Details by DOI
Retrieve detailed metadata for a specific preprint.
**Basic Usage:**
```python
python scripts/biorxiv_search.py \
--doi "10.1101/2024.01.15.123456" \
--output paper_details.json
```
**Full DOI URLs Accepted:**
```python
python scripts/biorxiv_search.py \
--doi "https://doi.org/10.1101/2024.01.15.123456"
```
### 5. PDF Downloads
Download the full-text PDF of any preprint.
**Basic Usage:**
```python
python scripts/biorxiv_search.py \
--doi "10.1101/2024.01.15.123456" \
--download-pdf paper.pdf
```
**Batch Processing:**
For multiple PDFs, extract DOIs from a search result JSON and download each paper:
```python
import json
from biorxiv_search import BioRxivSearcher
# Load search results
with open('results.json') as f:
data = json.load(f)
searcher = BioRxivSearcher(verbose=True)
# Download each paper
for i, paper in enumerate(data['results'][:10]): # First 10 papers
doi = paper['doi']
searcher.download_pdf(doi, f"papers/paper_{i+1}.pdf")
```
## Valid Categories
Filter searches by bioRxiv subject categories:
- `animal-behavior-and-cognition`
- `biochemistry`
- `bioengineering`
- `bioinformatics`
- `biophysics`
- `cancer-biology`
- `cell-biology`
- `clinical-trials`
- `developmental-biology`
- `ecology`
- `epidemiology`
- `evolutionary-biology`
- `genetics`
- `genomics`
- `immunology`
- `microbiology`
- `molecular-biology`
- `neuroscience`
- `paleontology`
- `pathology`
- `pharmacology-and-toxicology`
- `physiology`
- `plant-biology`
- `scientific-communication-and-education`
- `synthetic-biology`
- `systems-biology`
- `zoology`
## Output Format
All searches return structured JSON with the following format:
```json
{
"query": {
"keywords": ["CRISPR"],
"start_date": "2024-01-01",
"end_date": "2024-12-31",
"category": "genomics"
},
"result_count": 42,
"results": [
{
"doi": "10.1101/2024.01.15.123456",
"title": "Paper Title Here",
"authors": "Smith J, Doe J, Johnson A",
"author_corresponding": "Smith J",
"author_corresponding_institution": "University Example",
"date": "2024-01-15",
"version": "1",
"type": "new results",
"license": "cc_by",
"category": "genomics",
"abstract": "Full abstract text...",
"pdf_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1.full.pdf",
"html_url": "https://www.biorxiv.org/content/10.1101/2024.01.15.123456v1",
"jatsxml": "https://www.biorxiv.org/content/...",
"published": ""
}
]
}
```
## Common Usage Patterns
### Literature Review Workflow
1. **Broad keyword search:**
```python
python scripts/biorxiv_search.py \
--keywords "organoids" "tissue engineering" \
--start-date 2023-01-01 \
--end-date 2024-12-31 \
--category bioengineering \
--output organoid_papers.json
```
2. **Extract and review results:**
```python
import json
with open('organoid_papers.json') as f:
data = json.load(f)
print(f"Found {data['result_count']} papers")
for paper in data['results'][:5]:
print(f"\nTitle: {paper['title']}")
print(f"Authors: {paper['authors']}")
print(f"Date: {paper['date']}")
print(f"DOI: {paper['doi']}")
```
3. **Download selected papers:**
```python
from biorxiv_search import BioRxivSearcher
searcher = BioRxivSearcher()
selected_dois = ["10.1101/2024.01.15.123456", "10.1101/2024.02.20.789012"]
for doi in selected_dois:
filename = doi.replace("/", "_").replace(".", "_") + ".pdf"
searcher.download_pdf(doi, f"papers/{filename}")
```
### Trend Analysis
Track research trends by analyzing publication frequencies over time:
```python
python scripts/biorxiv_search.py \
--keywords "machine learning" \
--start-date 2020-01-01 \
--end-date 2024-12-31 \
--category bioinformatics \
--output ml_trends.json
```
Then analyze the temporal distribution in the results.
### Author Tracking
Monitor specific researchers' preprints:
```python
# Track multiple authors
authors = ["Smith", "Johnson", "Williams"]
for author in authors:
python scripts/biorxiv_search.py \
--author "{author}" \
--days-back 365 \
--output "{author}_papers.json"
```
## Python API Usage
For more complex workflows, import and use the `BioRxivSearcher` class directly:
```python
from scripts.biorxiv_search import BioRxivSearcher
# Initialize
searcher = BioRxivSearcher(verbose=True)
# Multiple search operations
keywords_papers = searcher.search_by_keywords(
keywords=["CRISPR", "gene editing"],
start_date="2024-01-01",
end_date="2024-12-31",
category="genomics"
)
author_papers = searcher.search_by_author(
author_name="Smith",
start_date="2023-01-01",
end_date="2024-12-31"
)
# Get specific paper details
paper = searcher.get_paper_details("10.1101/2024.01.15.123456")
# Download PDF
success = searcher.download_pdf(
doi="10.1101/2024.01.15.123456",
output_path="paper.pdf"
)
# Format results consistently
formatted = searcher.format_result(paper, include_abstract=True)
```
## Best Practices
1. **Use appropriate date ranges**: Smaller date ranges return faster. For keyword searches over long periods, consider splitting into multiple queries.
2. **Filter by category**: When possible, use `--category` to reduce data transfer and improve search precision.
3. **Respect rate limits**: The script includes automatic delays (0.5s between requests). For large-scale data collection, add addRelated in General
modeling-omnistudio-epc-catalog
IncludedSalesforce Industries CME EPC product-modeling skill for Product2-based catalog creation. Use when creating EPC products, configuring product attributes, building offer bundles with Product Child Items, or reviewing EPC DataPack JSON metadata for product catalog changes. TRIGGER when: user creates or updates Product2 EPC records, AttributeAssignment payloads, AttributeMetadata/AttributeDefaultValues, Offer bundles, or ProductChildItem relationships. DO NOT TRIGGER when: designing OmniScripts/FlexCards/Integration Procedures (use building-omnistudio-omniscript, building-omnistudio-flexcard, or building-omnistudio-integration-procedure), implementing Apex business logic (use generating-apex), or troubleshooting deployment pipelines (use deploying-metadata).
relationship-science-coach
IncludedUse this skill for direct, practical adult relationship coaching: couples conflict, repair, trust, marriage, dating, flirting, attachment patterns, emotional connection, sex, desire differences, eroticism, kink negotiation, affection, love languages, breakups, and long-term passion. Draw on Gottman, EFT and Hold Me Tight, attachment science, modern sex research, Perel, Nagoski, Kerner, Schnarch, Love and Stosny, and flexible love-language tools. Be concrete and low-hedge. Redirect only for imminent danger, abuse, coercive control, minors, non-consent, self-harm, stalking, or medical/legal/psychiatric decisions.
building-sf-integrations
IncludedSalesforce integration architecture and runtime plumbing with 120-point scoring. Use this skill to set up Named Credentials, External Credentials, External Services, REST/SOAP callout patterns, Platform Events, and Change Data Capture. TRIGGER when: user sets up Named Credentials, External Services, REST/SOAP callouts, Platform Events, CDC, or touches .namedCredential-meta.xml files. DO NOT TRIGGER when: Connected App/OAuth config (use configuring-connected-apps), Apex-only logic (use generating-apex), or data import/export (use handling-sf-data).
venue-templates
IncludedAccess comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). This skill should be used when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals and need venue-specific formatting requirements and templates.
let-fate-decide
IncludedDraws the 12 Houses of the Zodiac Tarot spread to inject entropy into planning when prompts are vague, ambiguous, or casually delegated. Interprets the spread to guide next steps. Use when the user says 'let fate decide', 'YOLO', 'whatever', 'idk', or other nonchalant phrases, makes Yu-Gi-Oh references, or when you are about to arbitrarily pick between multiple reasonable approaches. Prefer over ask-questions-if-underspecified when the user's tone is casual or playful rather than precision-seeking.
net-ops
IncludedCross-platform network troubleshooting (Windows, macOS, Linux) via local or remote shell. Use for: DNS broken, can't resolve hostnames, nslookup/dig works but apps fail, NRPT, WFP, scutil, /etc/resolver, systemd-resolved, /etc/resolv.conf, NetworkManager, VPN DNS leak residue (ProtonVPN/Mullvad/WireGuard/AnyConnect), AV/firewall blocking DNS or DoH, Tailscale DNS interaction, intermittent connectivity, remote diagnostics over SSH.