end-to-end-study


End-to-end data-first research pipeline — find an underutilised public dataset, pick the publishable claim, target a journal, ship LaTeX manuscript + tagged GitHub release. Use for computational-bio / bioinformatics / clinical-genomics on TCGA, GEO, GDC, cBioPortal, or when user says "end-to-end paper" / "what's publishable here".



# end-to-end-study

Chain: **best data for novelty** -> **low-hanging fruit** -> **most-likely paper (venue + story)** -> **author instructions** -> **method under rigor checklist** -> **reader/reviewer-oriented writing** -> **reviewer cycle** -> **private repo + tagged release**.

Nine phases, numbered 0 to 8; Phase 8 is optional. Phases 1 to 8 each link to at least one dedicated reference file.

## Phase 0 - Orientation (30 s)

Confirm with the user: the domain (oncology? immunology? neurology?), whether a topic is already given or the skill should scan datasets first, and the preferred claim type (method / translational finding / benchmark). Decline tasks outside scope (wet-lab validation, actual journal submission).

## Phase 1 - Best data for novelty (data-first scan)

Novelty usually lives in **under-mined assets of recently released large cohorts**, not in re-analyses of classic datasets. Scan for (1) recent large-cohort releases with a secondary modality that has not been systematically exploited, (2) paired-modality data where one modality is under-used, (3) public drug-response or perturbation screens paired with rich clinical metadata.

See [references/data-first-novelty.md](references/data-first-novelty.md) for the scan heuristics, a catalog of currently-underutilised public datasets, and a worked example (the BeatAML ex-vivo drug-sensitivity table in the Tyner 2018 supplementary material).

**Exit criterion**: a primary dataset is named, its under-mined asset is identified, and a short list of 2-3 candidate claims is drafted.

## Phase 2 - Low-hanging fruit selection

Within the chosen dataset, pick the claim with the best effort-to-impact ratio. See [references/low-hanging-fruit.md](references/low-hanging-fruit.md) for the selection matrix (combination-gap claim vs scale-gap vs rigor-gap vs orthogonal-outcome claim), with expected-IF mapping for each type.
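A minimal sketch of ranking candidate claims by effort-to-impact ratio. The `CandidateClaim` fields, the 1-10 impact score, and the effort-in-weeks estimate are illustrative assumptions; the actual selection matrix lives in the reference file and may weigh things differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateClaim:
    name: str
    gap_type: str        # one of the gap types named in the selection matrix
    impact: float        # hypothetical 1-10 score, higher is better
    effort_weeks: float  # hypothetical estimate of analysis effort

    def __post_init__(self):
        if self.effort_weeks <= 0:
            raise ValueError("effort_weeks must be positive")

    @property
    def ratio(self) -> float:
        """Impact delivered per week of effort."""
        return self.impact / self.effort_weeks


def rank_claims(claims):
    """Return claims sorted by impact per week of effort, best first."""
    return sorted(claims, key=lambda c: c.ratio, reverse=True)


# Made-up candidates, for illustration only.
claims = [
    CandidateClaim("combo biomarker", "combination-gap", impact=7.0, effort_weeks=4.0),
    CandidateClaim("larger-cohort replication", "scale-gap", impact=5.0, effort_weeks=2.0),
    CandidateClaim("re-analysis with nested CV", "rigor-gap", impact=6.0, effort_weeks=6.0),
]
for claim in rank_claims(claims):
    print(f"{claim.name} ({claim.gap_type}): {claim.ratio:.2f}")
```

With these made-up scores the scale-gap claim ranks first (2.50), ahead of the combination-gap (1.75) and rigor-gap (1.00) claims. The ranking is only as good as the scores fed in, so the literature check still decides whether the top claim survives.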

Use WebSearch + bioRxiv / PubMed MCP to verify that the chosen claim has not already been published. Heuristics for keyword laddering and gap verification are in [references/novelty-search.md](references/novelty-search.md).

**Exit criterion**: one sentence of the form "No prior work does X in Y; this paper shows Z (effect-size guess)", validated against literature.

## Phase 3 - Target journal + author instructions

Pick the journal where this specific claim has the highest acceptance probability, not the highest IF in absolute terms. See [references/journal-targets.md](references/journal-targets.md) for the tiered catalog with acceptance-probability heuristics. Then immediately **fetch the target journal's author instructions** with WebFetch and distil them into the project-local `manuscript/JOURNAL.md`.

See [references/author-instructions.md](references/author-instructions.md) for the extraction checklist (word limits, figure count, reference style, reporting-guideline requirements, data-availability policy, preprint policy) and the canonical author-instructions URLs for Nature Communications, Leukemia, Genome Medicine, Cell Reports Medicine, Blood Cancer Journal, and Briefings in Bioinformatics.
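One way to turn the extraction checklist into `manuscript/JOURNAL.md` is sketched below. The field names, the Markdown layout, and the `render_journal_md` helper are assumptions made for illustration; the skill's real file format is not shown in this document, and the values are placeholders rather than any journal's actual limits.

```python
# Checklist items taken from the extraction checklist; the key names are hypothetical.
CHECKLIST_FIELDS = (
    "word_limit",
    "figure_count",
    "reference_style",
    "reporting_guidelines",
    "data_availability_policy",
    "preprint_policy",
)

MISSING = "TODO: not found in author instructions"


def render_journal_md(journal: str, extracted: dict) -> str:
    """Render the distilled author instructions as a Markdown checklist.

    Fields that were not extracted are flagged explicitly instead of
    being silently dropped, so gaps are visible before writing starts.
    """
    lines = [f"# {journal}", ""]
    for field in CHECKLIST_FIELDS:
        value = extracted.get(field, MISSING)
        lines.append(f"- **{field.replace('_', ' ')}**: {value}")
    return "\n".join(lines) + "\n"


# Placeholder values only; fetch the real ones from the journal's own page.
example = render_journal_md(
    "Example Journal",
    {"word_limit": "<limit from author instructions>", "reference_style": "<style name>"},
)
print(example)
```

Flagging missing fields rather than omitting them keeps the Phase 3 exit criterion honest: the file is only "distilled" once no `TODO` lines remain.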

Copy the LaTeX skeleton from `assets/latex/` and edit the bibliography style, section order, and figure-count cap to match the target journal.

**Exit criterion**: journal named, author-instructions distilled into `manuscript/JOURNAL.md`, LaTeX template adjusted, tex compiles with placeholder title.

## Phase 4 - Project setup + preregistration + method design

Scaffold with `scripts/init_project.py <project-dir>` (creates `data/raw`, `data/processed`, `data/results`, `analysis/`, `manuscript/`, `figures/`, `docs/`, `.github/workflows/`, `.gitignore`, `LICENSE`, `README.md`, `pyproject.toml`, and a `docs/prereg.md` stub). The script refuses to overwrite existing critical files without `--force` (which takes timestamped backups). Install Python dependencies with `uv add`.
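The overwrite guard described above can be sketched as follows. This is not the contents of `scripts/init_project.py`: the `scaffold` function, the stub file contents, and the `.bak` naming scheme are assumptions, and only two of the listed files are modelled to keep the example short.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

DIRS = ["data/raw", "data/processed", "data/results", "analysis",
        "manuscript", "figures", "docs", ".github/workflows"]
# Hypothetical stub contents; the real templates are not shown here.
FILES = {"docs/prereg.md": "# Preregistration\n", "README.md": "# Project\n"}


def scaffold(root: Path, force: bool = False) -> list[Path]:
    """Create the project layout and return the files written.

    Refuses to touch existing files unless force=True, in which case
    each existing file is copied to a timestamped backup first.
    """
    existing = [root / rel for rel in FILES if (root / rel).exists()]
    if existing and not force:
        names = ", ".join(str(p) for p in existing)
        raise FileExistsError(f"refusing to overwrite: {names} (use force=True)")

    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    written = []
    for rel, content in FILES.items():
        target = root / rel
        if target.exists():
            # Back up before overwriting so nothing is lost under force.
            shutil.copy2(target, target.with_name(f"{target.name}.{stamp}.bak"))
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp) / "my-study")
    print([p.name for p in created])
```

Checking for conflicts before creating anything means a refused run leaves the directory untouched, which matters when the existing file is a committed `docs/prereg.md`.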

**Commit `docs/prereg.md` before touching any outcome-related analysis.** The preregistration commit is the single most important integrity artefact; skipping it turns any later outcome pivot into HARKing. See [references/preregistration-and-integrity.md](references/preregistration-and-integrity.md) for the required contents and the rules that make the orthogonal-outcome pivot legitimate.

Design the method **before** looking at results. See [references/method-design.md](references/method-design.md) for the rigor checklist (leakage-free feature selection, stability / parameter sweep, permutation null, proportional-hazards tests, nested C-index, calibration + DCA / NRI if clinical, bootstrap CIs, multiple-testing correction). The target journal's author instructions drive the checklist priority - clinical journals weight calibration + DCA; methods journals weight stability + null models.
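As an illustration of one checklist item, the permutation null, here is a standard-library sketch of a two-sided permutation test on a difference in group means. The function name, the choice of test statistic, and the toy data are assumptions; the reference file may prescribe a different statistic (for example a C-index) or a different permutation scheme.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for the difference in group means.

    Group labels are shuffled n_permutations times; the p-value is the
    share of shuffles whose absolute mean difference is at least as
    large as the observed one. The +1 in numerator and denominator
    counts the observed labelling itself, so p is never exactly zero.
    """
    rng = random.Random(seed)  # fixed seed keeps the analysis reproducible
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:])) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Toy data, clearly separated groups: expect a small p-value.
responders = [10, 11, 12, 13, 14, 15]
non_responders = [0, 1, 2, 3, 4, 5]
print(permutation_p_value(responders, non_responders))
```

Fixing the seed and recording `n_permutations` in `docs/prereg.md` before running the test keeps the null model from becoming a tunable parameter after the fact.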

Common open-dataset URLs and download recipes live in [references/open-datasets.md](references/open-datasets.md); access codes (Open / Gated / Mixed) in [references/data-first-novelty.md](references/data-first-novelty.md) tell you which raw artefacts can be redistributed in the final release.

**Exit criterion**: `docs/prereg.md` committed, data downloaded, analysis scripts stubbed, `uv run python analysis/01_prepare_data.py` succeeds.

## Phase 5 - Analysis, figures, manuscript with journal-matched writing

Iterate analysis -> figures -> LaTeX. Structural conventions in [references/manuscript-structure.md](references/manuscript-structure.md).

Writing style is journal-specific. See [references/reader-reviewer-writing.md](references/reader-reviewer-writing.md) for the explicit reader and reviewer profiles of Nature Communications, Leukemia, Genome Medicine, Blood Cancer Journal, Cell Reports Medicine, and Briefings in Bioinformatics, and for the abstract-headline + discussion-caveat patterns that work at each. A Leukemia reader expects the clinical hook in the first sentence; a Briefings reader expects the methods gap; a Nature Communications reader expects the broad-interest statement followed by the quantitative headline.

Compile with `cd manuscript && make`.

**Exit criterion**: full PDF compiles; the abstract headline, discussion caveats, and data/code availability sections match the style distilled into `manuscript/JOURNAL.md`.

## Phase 6 - Reviewer cycle

Dispatch four adversarial reviewers in parallel (methods, clinical, biostatistics, target-journal editor). Iterate until unanimous accept. Pattern and prompt templates in [references/reviewer-cycle.md](references/reviewer-cycle.md). The target-journal editor reviewer is instantiated with that journal's historical concerns - loaded from the author-instructions file produced in Phase 3.
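The control flow of the cycle can be sketched as below. The `review` and `revise` callables are hypothetical stand-ins for whatever actually dispatches the reviewers and applies revisions, the toy versions exist only to make the loop runnable, and the `max_rounds` cap is an added safety assumption that the text above does not specify.

```python
from concurrent.futures import ThreadPoolExecutor

ROLES = ("methods", "clinical", "biostatistics", "target-journal editor")


def run_reviewer_cycle(manuscript, review, revise, max_rounds=5):
    """Review in parallel each round; stop at the first unanimous accept.

    review(role, manuscript) -> "accept" or "revise"
    revise(manuscript, verdicts) -> revised manuscript
    Returns (final manuscript, number of rounds used).
    """
    for round_no in range(1, max_rounds + 1):
        with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
            results = pool.map(lambda role: review(role, manuscript), ROLES)
            verdicts = dict(zip(ROLES, results))
        if all(v == "accept" for v in verdicts.values()):
            return manuscript, round_no
        manuscript = revise(manuscript, verdicts)
    raise RuntimeError(f"no unanimous accept after {max_rounds} rounds")


# Toy stand-ins: the "manuscript" is just a revision counter, and each
# reviewer accepts once enough revisions have been made.
REVISIONS_NEEDED = {"methods": 0, "clinical": 1, "biostatistics": 2,
                    "target-journal editor": 1}


def toy_review(role, revision):
    return "accept" if revision >= REVISIONS_NEEDED[role] else "revise"


def toy_revise(revision, verdicts):
    return revision + 1


final_revision, rounds = run_reviewer_cycle(0, toy_review, toy_revise)
print(final_revision, rounds)  # prints "2 3"
```

Every reviewer re-reads the manuscript in every round, including those who accepted earlier, so a revision made for one reviewer cannot silently break something another reviewer had approved.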

## Phase 7 - Repo and tagged release

Create private GitHub repo, wire CI, push tag, release PDF + .tex + source bundle. Exact commands + CI workflow template in [references/release-workflow.md](references/release-workflow.md). CI template at `assets/github/release.yml`; LaTeX helpers at `assets/latex/Makefile` + `latexmkrc`. Use `scripts/new_release.sh` for a preflight-checked release (rejects dirty git tree, missing `gh` auth, stale placeholder DOIs, absent figures).

Release naming: `v<MAJOR>.<MINOR>.<PATCH>` where MAJOR bumps reflect scientific pivots, MINOR bumps reflect reviewer-round revisions, PATCH bumps reflect typos and terminology.
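The naming rule maps directly onto a small helper, sketched here. The `next_tag` function and the change-type labels are hypothetical, and resetting the lower components to zero on a bump follows the usual semantic-versioning convention rather than anything stated above.

```python
import re

# Change types from the release-naming rule; the label strings are made up.
BUMP_FOR_CHANGE = {
    "scientific-pivot": "major",
    "reviewer-round": "minor",
    "typo-or-terminology": "patch",
}

TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def next_tag(current: str, change: str) -> str:
    """Return the next release tag for a given kind of change."""
    match = TAG_PATTERN.fullmatch(current)
    if match is None:
        raise ValueError(f"not a v<MAJOR>.<MINOR>.<PATCH> tag: {current!r}")
    if change not in BUMP_FOR_CHANGE:
        raise ValueError(f"unknown change type: {change!r}")
    major, minor, patch = (int(part) for part in match.groups())
    bump = BUMP_FOR_CHANGE[change]
    if bump == "major":
        return f"v{major + 1}.0.0"
    if bump == "minor":
        return f"v{major}.{minor + 1}.0"
    return f"v{major}.{minor}.{patch + 1}"


print(next_tag("v1.2.3", "reviewer-round"))  # prints "v1.3.0"
```

Read this way, the tag history doubles as a project log: the MAJOR number counts scientific pivots and the MINOR number counts reviewer rounds since the last pivot.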

## Phase 8 - Submission support (optional but expected at IF 10+)

The skill's release workflow ends at a tagged preprint-ready PDF. Actual submission happens off-skill, but the supporting artefacts are templated here:

- Preprint posting (bioRxiv / medRxiv / arXiv) + Zenodo DOI minting: [references/preprint-and-dois.md](references/preprint-and-dois.md). Ship the preprint before the journal submission so the cover letter can cite it.
- Cover letter + point-by-point rebuttal templates: [references/cover-letter-and-rebuttal.md](references/cover-letter-and-rebuttal.md). Rebuttals are the highest-leverage artefact in a revision cycle.
- Submission-portal quirks (Editorial Manager, ScholarOne, Snapsubmit, figure-format, ORCID, supplementary material): [references/submission-portals.md](references/submission-portal

Files: 25

Size: 106.7 KB


Source: https://github.com/htlin222/dotfiles/tree/main/claude.symlink/skills/end-to-end-study
