Claude
Skills
Sign in
Back

apify-scrapers

Included with Lifetime
$97 forever

Social media and web scraping using Apify actors. Use this skill when scraping Twitter/X tweets, Reddit posts, LinkedIn posts, Instagram profiles/posts/reels, Facebook pages/posts/groups, TikTok videos, YouTube content, Google Maps businesses/reviews, contact enrichment (emails/phones from websites), or when auto-detecting URL type to scrape. Triggers on requests to scrape social media, get trending posts, extract business info, find contact details, or extract content from social URLs.

Ads & Marketingscripts

What this skill does


# Apify Scrapers

## Overview

Scrape content from major social platforms using Apify actors. Each platform has optimized settings for cost and quality.

## Quick Decision Tree

```
What do you want to scrape?
│
├── Social Media Posts
│   ├── Twitter/X → references/twitter.md
│   │   └── Script: scripts/scrape_twitter_ai_trends.py
│   │
│   ├── Reddit → references/reddit.md
│   │   └── Script: scripts/scrape_reddit_ai_tech.py
│   │
│   ├── LinkedIn → references/linkedin.md
│   │   └── Script: scripts/scrape_linkedin_posts.py
│   │
│   ├── Instagram → references/instagram.md
│   │   └── Script: scripts/scrape_instagram.py
│   │   └── Modes: profile, posts, hashtag, reels, comments
│   │
│   ├── Facebook → references/facebook.md
│   │   └── Script: scripts/scrape_facebook.py
│   │   └── Modes: page, posts, reviews, groups, marketplace
│   │
│   ├── TikTok → references/multi-platform.md
│   │   └── Script: scripts/scrape_multi_platform.py
│   │
│   └── YouTube → references/multi-platform.md
│       └── Script: scripts/scrape_multi_platform.py
│
├── Business/Places
│   ├── Google Maps businesses → references/google-maps.md
│   │   └── Script: scripts/scrape_google_maps.py
│   │   └── Modes: search, place, reviews
│   │
│   └── Contact info from websites → references/contact-enrichment.md
│       └── Script: scripts/scrape_contact_info.py
│       └── Extract: emails, phone numbers, social profiles
│
├── Auto-detect URL type → references/url-detect.md
│   └── Script: scripts/scrape_content_by_url.py
│
├── Trend Analysis (NEW)
│   └── Enriched trend analysis → workflows/trend-analysis.md
│       └── Script: scripts/analyze_trends.py
│       └── Features: velocity scoring, lifecycle staging, opportunity scoring
│
└── Workflows (multi-step)
    ├── Lead generation → workflows/lead-generation.md
    ├── Influencer discovery → workflows/influencer-discovery.md
    ├── Competitor analysis → workflows/competitor-intel.md
    ├── Trend analysis → workflows/trend-analysis.md
    └── Competitor Ads Intelligence (NEW) → workflows/competitor-ads.md
        └── Script: scripts/scrape_competitor_ads.py
        └── Platforms: Facebook Ads Library, Google Ads Transparency
        └── Features: Spend estimates, creative analysis, benchmarking
```

## Environment Setup

```bash
# Required in .env
APIFY_TOKEN=apify_api_xxxxx
```

Get your API key: https://console.apify.com/account/integrations

## Common Usage Patterns

### Scrape Twitter Trends
```bash
python scripts/scrape_twitter_ai_trends.py --query "AI agents" --max-tweets 50
```

### Scrape Reddit Discussions
```bash
python scripts/scrape_reddit_ai_tech.py --subreddits "MachineLearning,LocalLLaMA" --max-posts 100
```

### Scrape LinkedIn Author
```bash
python scripts/scrape_linkedin_posts.py author "https://linkedin.com/in/username" --max-posts 30
```

### Auto-detect and Scrape URL
```bash
python scripts/scrape_content_by_url.py "https://x.com/user/status/123456"
```

### Scrape Instagram Profile
```bash
python scripts/scrape_instagram.py profile "https://instagram.com/username" --max-posts 20
```

### Scrape Instagram Hashtag
```bash
python scripts/scrape_instagram.py hashtag "#artificialintelligence" --max-posts 50
```

### Scrape Instagram Reels
```bash
python scripts/scrape_instagram.py reels "https://instagram.com/username" --max-reels 30
```

### Scrape Facebook Page
```bash
python scripts/scrape_facebook.py page "https://facebook.com/pagename" --max-posts 50
```

### Scrape Facebook Reviews
```bash
python scripts/scrape_facebook.py reviews "https://facebook.com/pagename" --max-reviews 100
```

### Scrape Facebook Marketplace
```bash
python scripts/scrape_facebook.py marketplace "laptops in san francisco" --max-items 30
```

### Scrape Google Maps Businesses
```bash
python scripts/scrape_google_maps.py search "AI consulting firms in New York" --max-results 50
```

### Scrape Google Maps Reviews
```bash
python scripts/scrape_google_maps.py reviews "ChIJN1t_tDeuEmsRUsoyG83frY4" --max-reviews 100
```

### Extract Contact Info from Websites
```bash
python scripts/scrape_contact_info.py "https://example.com" --depth 2
```

### Bulk Contact Enrichment
```bash
python scripts/scrape_contact_info.py --urls-file companies.txt --output contacts.json
```

### Scrape Competitor Ads (Single Competitor)
```bash
python scripts/scrape_competitor_ads.py "Nike" --platforms facebook google --country US --days 30
```

### Compare Multiple Competitors' Ads
```bash
python scripts/scrape_competitor_ads.py "Nike" "Adidas" "Puma" --compare --output comparison.json
```

### Discover Advertisers by Keyword
```bash
python scripts/scrape_competitor_ads.py --search "running shoes" --country US --max-ads 200
```

### Filter Competitor Ads by Media Type
```bash
python scripts/scrape_competitor_ads.py "Netflix" "Disney+" --platforms facebook --media-types video --days 7
```

### Analyze Trends (NEW)
```bash
# Analyze specific topic with enrichments
python scripts/analyze_trends.py "artificial intelligence" --sources google instagram tiktok --days 90

# Discover trending topics in category
python scripts/analyze_trends.py --category technology --discover --top 50

# Compare multiple trends
python scripts/analyze_trends.py "AI" "blockchain" "metaverse" --compare

# Export HTML trend report
python scripts/analyze_trends.py "sustainable fashion" --format html --output trend_report.html
```

## Cost Estimates

| Platform | Actor | Cost per Item |
|----------|-------|---------------|
| Twitter | kaitoeasyapi/twitter-x-data-tweet-scraper | ~$0.00025 |
| Reddit | trudax/reddit-scraper | ~$0.001-0.005 |
| LinkedIn | harvestapi/linkedin-post-search | ~$0.01-0.05 |
| YouTube | streamers/youtube-scraper | ~$0.01-0.05 |
| TikTok | clockworks/tiktok-scraper | ~$0.005 |
| Instagram (profile) | apify/instagram-profile-scraper | ~$0.005 |
| Instagram (posts) | apify/instagram-post-scraper | ~$0.002-0.005 |
| Instagram (hashtag) | apify/instagram-hashtag-scraper | ~$0.002-0.005 |
| Instagram (reels) | apify/instagram-reel-scraper | ~$0.005-0.01 |
| Instagram (comments) | apify/instagram-comment-scraper | ~$0.001-0.003 |
| Facebook (page) | apify/facebook-pages-scraper | ~$0.005-0.01 |
| Facebook (posts) | apify/facebook-posts-scraper | ~$0.003-0.005 |
| Facebook (reviews) | apify/facebook-reviews-scraper | ~$0.002-0.005 |
| Facebook (groups) | apify/facebook-groups-scraper | ~$0.005-0.01 |
| Facebook (marketplace) | apify/facebook-marketplace-scraper | ~$0.005-0.01 |
| Google Maps (search) | compass/crawler-google-places | ~$0.01-0.02 |
| Google Maps (place) | compass/google-maps-business-scraper | ~$0.01 |
| Google Maps (reviews) | compass/google-maps-reviews-scraper | ~$0.003-0.005 |
| Contact Enrichment | lukaskrivka/contact-info-scraper | ~$0.01-0.03 |
| Google Trends | apify/google-trends-scraper | ~$0.01 |
| Trend Analysis (multi) | Multiple actors | ~$0.50-1.50/run |
| Facebook Ads Library | apify/facebook-ads-scraper | ~$0.75/1K ads |
| Facebook Ads (alt) | curious_coder/facebook-ads-library-scraper | ~$0.50/1K ads |
| Google Ads Transparency | lexis-solutions/google-ads-scraper | ~$1.00/1K ads |
| Google Ads (alt) | xtech/google-ad-transparency-scraper | ~$0.80/1K ads |

## Output Location

All scraped data saves to `.tmp/` with timestamped filenames:
- `.tmp/twitter_ai_trends_YYYYMMDD.json`
- `.tmp/reddit_ai_tech_YYYYMMDD.json`
- `.tmp/linkedin_posts_YYYYMMDD_HHMMSS.json`

## Security Notes

### Credential Handling
- Store `APIFY_TOKEN` in `.env` file (never commit to git)
- Rotate API tokens periodically via Apify Console
- Never log or print API tokens in script output
- Use environment variables, not hardcoded values

### Data Privacy
- Scraped data contains only publicly available content
- Social media posts may include PII (names, handles, profile info)
- Data is stored locally in `.tmp/` directory
- No data is retained by Apify after actor run completes
- Consider data minimization - only scrape what you need

### Access Scopes
- Apify tokens 

Related in Ads & Marketing