harvard-library-catalog
Search Harvard Library's 13M+ bibliographic records via LibraryCloud and retrieve MARC/MODS data via PRESTO. Use this skill whenever the user wants to look up books, manuscripts, finding aids, or other items in Harvard's library catalog, verify bibliographic information (title, author, ISBN, publication date), find digital collections, or retrieve detailed catalog records. Also triggers when a user extracts a book title from a document and wants to find its full bibliographic metadata.
# Harvard Library API Skill
Search and retrieve bibliographic records from Harvard Library's catalog of 13M+ items.
## Critical: Things Claude Won't Know Without This Skill
### LibraryCloud field-based search uses query parameters, NOT Solr syntax
The Item API uses **field names as query parameters** — not `q=field:value` Solr syntax.
```
CORRECT: https://api.lib.harvard.edu/v2/items.json?title=hamlet&name=shakespeare
WRONG: https://api.lib.harvard.edu/v2/items?q=title:hamlet
```
The `q=` parameter is for **keyword search across all fields**. Field-specific search uses dedicated parameters like `title=`, `name=`, `subject=`, `identifier=`, etc.
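As a minimal sketch of the distinction above, the helper below turns keyword arguments into query parameters with the standard library. The function name `build_item_url` is hypothetical and is not part of `scripts/harvard_api.py`; only the endpoint and parameter names come from this document.

```python
from urllib.parse import urlencode

ITEMS_JSON = "https://api.lib.harvard.edu/v2/items.json"


def build_item_url(**fields):
    """Hypothetical helper: each keyword becomes its own query parameter."""
    # No Solr-style q=field:value; field names are the parameter names.
    return f"{ITEMS_JSON}?{urlencode(fields)}"


# Field-specific search: one parameter per field.
field_url = build_item_url(title="hamlet", name="shakespeare")

# Keyword search across all fields: a single q= parameter.
keyword_url = build_item_url(q="hamlet shakespeare")

print(field_url)
print(keyword_url)
```

`urlencode` also takes care of escaping spaces and punctuation in the values, which is easy to get wrong when concatenating strings by hand.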
### JSON requires `.json` in the URL path
Responses are XML by default. To get JSON, append `.json` before the query string:
```
JSON: https://api.lib.harvard.edu/v2/items.json?title=hamlet
Dublin Core: https://api.lib.harvard.edu/v2/items.dc.json?title=hamlet
Default XML: https://api.lib.harvard.edu/v2/items?title=hamlet
```
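One way to keep the format choice in a single place is a small lookup from a format label to the path suffix. This is only a sketch: the labels on the left (`xml`, `json`, `dc-json`) and the function name `items_endpoint` are illustrative assumptions, while the suffixes themselves are the ones shown in the URLs above.

```python
ITEMS_BASE = "https://api.lib.harvard.edu/v2/items"

# Suffixes come from the example URLs; the labels are illustrative only.
FORMAT_SUFFIXES = {
    "xml": "",              # default response format
    "json": ".json",        # JSON serialization
    "dc-json": ".dc.json",  # Dublin Core as JSON
}


def items_endpoint(fmt="json"):
    """Return the Items endpoint path for a given response format label."""
    if fmt not in FORMAT_SUFFIXES:
        raise ValueError(f"unknown format label: {fmt!r}")
    return ITEMS_BASE + FORMAT_SUFFIXES[fmt]


print(items_endpoint("json") + "?title=hamlet")
```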
### PRESTO is for direct record lookup by HOLLIS ID
PRESTO returns raw MARC, MODS, or Dublin Core for a single record by its HOLLIS number. It complements LibraryCloud when you need the original catalog record:
```
MARC: https://webservices.lib.harvard.edu/rest/marc/hollis/{HOLLIS_ID}
MODS: https://webservices.lib.harvard.edu/rest/mods/hollis/{HOLLIS_ID}
DC: https://webservices.lib.harvard.edu/rest/dc/hollis/{HOLLIS_ID}
```
PRESTO returns XML only and does not support JSON serialization. ISBN/barcode lookups may not work on all records.
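The URL pattern above can be sketched as a small builder that rejects formats PRESTO is not described as serving. The name `presto_url` is hypothetical, and the sample ID is the one used in the Python Script section of this document.

```python
PRESTO_BASE = "https://webservices.lib.harvard.edu/rest"
PRESTO_FORMATS = ("marc", "mods", "dc")


def presto_url(hollis_id, fmt="marc"):
    """Hypothetical helper: build a PRESTO lookup URL for one HOLLIS ID."""
    if fmt not in PRESTO_FORMATS:
        raise ValueError(f"PRESTO format must be one of {PRESTO_FORMATS}")
    # Keep the ID as a string so leading zeros are preserved.
    return f"{PRESTO_BASE}/{fmt}/hollis/{hollis_id}"


print(presto_url("011557057", "mods"))
```

Because the response is XML, a caller would typically hand the body to an XML parser such as `xml.etree.ElementTree` rather than `json`.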
### User-Agent header is required
LibraryCloud returns 403 without a User-Agent header. Always include one:
```bash
curl -H 'User-Agent: MyApp/1.0' 'https://api.lib.harvard.edu/v2/items.json?title=hamlet'
```
The Python script includes this automatically.
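For callers not using the bundled script, the same header can be set with the standard library. The sketch below is an assumption about how one might do it, not a copy of `scripts/harvard_api.py`; `make_request` and `fetch_json` are hypothetical names, and `MyApp/1.0` is the placeholder value from the curl example.

```python
import json
from urllib.request import Request, urlopen


def make_request(url, user_agent="MyApp/1.0"):
    """Build a request that always carries a User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_json(url):
    """Send the request and decode a JSON body (performs a network call)."""
    with urlopen(make_request(url), timeout=30) as resp:
        return json.load(resp)


req = make_request("https://api.lib.harvard.edu/v2/items.json?title=hamlet")
# urllib normalizes header names, so the stored key is "User-agent".
print(req.get_header("User-agent"))
```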
### Rate limit: max 1 request/second, 300 per 5 minutes
Exceeding this triggers a 5-minute lockout. The Python script handles this automatically.
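A client without the bundled script needs its own throttle. The class below is a minimal sketch of one approach (enforcing a minimum gap between calls); it is not the script's actual implementation, and `MinIntervalLimiter` is a hypothetical name. Spacing requests one second apart sits right at the 300-per-5-minutes ceiling, so a slightly larger interval leaves some headroom.

```python
import time


class MinIntervalLimiter:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last = None     # time of the previous permitted call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = MinIntervalLimiter(min_interval=1.1)
limiter.wait()  # the first call never sleeps; call wait() before every request
```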
## Choosing an Access Method
| Need | Method |
|------|--------|
| Search by title, author, subject, date | LibraryCloud Item API (field params) |
| Full-text keyword search | LibraryCloud Item API (`q=` param) |
| Look up by ISBN, LCCN, or other identifier | LibraryCloud `identifier=` or `q=` keyword |
| Browse digital collections | LibraryCloud `collectionTitle=` or Collections API |
| Get raw MARC record for a known HOLLIS ID | PRESTO `/rest/marc/hollis/{id}` |
| Faceted browsing (by language, date, genre) | LibraryCloud `facets=` parameter |
## Typical Workflow: Book Title to Full Bibliography
This is the primary use case — an LLM extracts a book title from a document and needs complete bibliographic data:
```python
from scripts.harvard_api import HarvardLibraryAPI

api = HarvardLibraryAPI()

# 1. Search by title (and optionally author)
results = api.search(title="The Great Gatsby", name="Fitzgerald")

if results:
    # 2. Get the first match's summary
    summary = api.summarize(results[0])
    # → title, author, publisher, date, ISBN, subjects, language, physical description

    # 3. For deeper data, fetch the raw catalog record via PRESTO
    #    (MODS here; pass format="marc" for MARC)
    hollis_id = api.get_record_id(results[0])
    if hollis_id:
        mods = api.get_presto_record(hollis_id, format="mods")
```
## Key Search Fields
| Field | What it searches | Exact match? |
|-------|-----------------|-------------|
| `q` | All fields (keyword) | No |
| `title` | Title, subtitle, part name/number | Yes (`title_exact`) |
| `name` | All name fields (author, editor, etc.) | No |
| `subject` | All subject fields (topic, geographic, temporal) | Yes (`subject_exact`) |
| `identifier` | ISBN, LCCN, other system IDs | Yes |
| `languageCode` | ISO language code (e.g., `chi`, `eng`) | Yes |
| `dateIssued` | Publication date (YYYY) | Yes |
| `dates.start` / `dates.end` | Date range filter | — |
| `genre` | Genre/form (e.g., "Drawings", "Maps") | Yes (`genre_exact`) |
| `repository` | Harvard library name | Yes |
| `isOnline` | Has digital version (`true`/`false`) | — |
| `recordIdentifier` | HOLLIS/Alma record ID | Yes |
Combine fields freely: `?title=hamlet&name=shakespeare&languageCode=ger&dates.start=1900`
## Pagination
- `limit=N` (default 10, max 250)
- `start=N` for offset-based pagination (up to ~30K results)
- `cursor=*` then `cursor={nextCursor}` for large result sets (up to 100K)
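The offset-based option can be sketched as a generator that advances `start` by the page size until a short or empty page arrives. `fetch_page` is a hypothetical callable standing in for whatever performs the request and returns a list of records; the response shape is not specified here, so the demo uses fake in-memory data.

```python
def paginate(fetch_page, limit=250, max_results=1000):
    """Yield records page by page using start/limit offsets."""
    start = 0
    while start < max_results:
        page_size = min(limit, max_results - start)
        page = fetch_page(start=start, limit=page_size)
        if not page:
            break  # no more results
        yield from page
        if len(page) < page_size:
            break  # short page means the result set is exhausted
        start += len(page)


# Demo with a fake backend holding 600 records (no network involved).
fake_records = list(range(600))


def fake_fetch(start, limit):
    return fake_records[start:start + limit]


collected = list(paginate(fake_fetch, limit=250, max_results=1000))
print(len(collected))
```

For result sets beyond the offset ceiling, the same loop shape applies with `cursor=` in place of `start=`, passing each response's next cursor into the following request.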
## Facets
Add `facets=field1,field2` to get value counts. Useful fields: `name`, `subject`, `languageCode`, `genre`, `resourceType`, `repository`, `dateIssued`.
```
?title=china&facets=languageCode,genre
```
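Building that query string in code might look like the sketch below. `build_facet_url` is a hypothetical helper; passing `safe=","` to `urlencode` keeps the commas in the facet list unescaped so the URL matches the form shown above. Taking the fields as a dict also allows dotted parameter names such as `dates.start`, which cannot be written as Python keyword arguments.

```python
from urllib.parse import urlencode

ITEMS_JSON = "https://api.lib.harvard.edu/v2/items.json"


def build_facet_url(fields, facets):
    """Combine search fields with a comma-separated facets parameter."""
    params = dict(fields)
    params["facets"] = ",".join(facets)
    # safe="," leaves the separators in facets=a,b unescaped.
    return f"{ITEMS_JSON}?{urlencode(params, safe=',')}"


facet_url = build_facet_url({"title": "china"}, ["languageCode", "genre"])
print(facet_url)
```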
## Python Script
Use `scripts/harvard_api.py` for programmatic access (zero dependencies):
```python
from scripts.harvard_api import HarvardLibraryAPI
api = HarvardLibraryAPI()
# Keyword search
results = api.search(q="Chinese porcelain Ming dynasty")
# Field search
results = api.search(title="dream of the red chamber", languageCode="chi")
# With facets
results, facets = api.search_with_facets(subject="astronomy", facets=["genre", "dateIssued"])
# Pagination
all_results = api.search_all(title="peanuts", name="schulz", max_results=500)
# PRESTO lookup
marc_xml = api.get_presto_record("011557057", format="marc")
# Summarize a record
for r in results[:5]:
    print(api.summarize(r))
```
## API Endpoints
| Endpoint | URL |
|----------|-----|
| LibraryCloud Items | `https://api.lib.harvard.edu/v2/items` |
| LibraryCloud Collections | `https://api.lib.harvard.edu/v2/collections` |
| PRESTO (MARC/MODS/DC) | `https://webservices.lib.harvard.edu/rest/{format}/hollis/{id}` |
## Related Skills
- **wikidata-search**: Cross-reference Harvard catalog entries with Wikidata for external identifiers (VIAF, LoC, etc.)
- **cbdb-api**: Look up authors of Chinese historical texts in CBDB for biographical context
## Resources
- `references/api_reference.md` — Complete field reference with all searchable fields, facets, and query examples
- `scripts/harvard_api.py` — Full-featured Python client with rate limiting, pagination, and record summarization