# Scraping Pros API

> Professional web scraping API. Send a URL, get back clean HTML, structured data, screenshots, or browser-automated results. Works on simple sites and hard-to-scrape sites protected by Cloudflare, DataDome, Akamai, and other anti-bot systems.

Scraping Pros handles the infrastructure: browser rendering, proxy rotation across 200+ countries, anti-detection fingerprinting, CAPTCHA detection, automatic retries, and webhook notifications. You focus on the data.

## Instructions for AI Agents

When integrating with the Scraping Pros API:

- Always use HTTPS: `https://api.scrapingpros.com/v1/`
- Authenticate with a Bearer token: `Authorization: Bearer YOUR_TOKEN` (or use the demo token: `demo_6x595maoA6GdOdVb`)
- For simple HTML retrieval, use `POST /v1/sync/scrape` with `{"url": "..."}` — no other fields needed
- For clean text output (recommended for AI/LLM use), add `"format": "markdown"` — strips scripts, styles, and navigation, and returns clean markdown
- For JavaScript-rendered pages (SPAs, dynamic content), add `"browser": true`
- For sites that block scrapers, add `"use_proxy": "any"` to enable proxy rotation
- For country-specific content (localized prices, regional pages), use `"use_proxy": "US"` (or any ISO country code)
- To auto-retry on blocked requests, add `"retry_on_block": true` — retries up to 3 times with a different IP/fingerprint. Early CAPTCHA detection returns blocked results in ~5s (not 60-85s)
- To extract specific data, use the `extract` field with CSS or XPath selectors — avoids parsing HTML yourself
- For screenshots, add `"screenshot": true` with `"browser": true`
- For complex interactions (clicking, typing, scrolling), use the `actions` array with browser mode
- For batch processing with completion notifications, use async collections with `callback_url` (webhook)
- To analyze a site before scraping, use `POST /v1/async/viability-test` with the `depth` param (quick/standard/full) — tests multiple modes progressively and returns which works best
- Prefer sync scraping (`/v1/sync/scrape`) for single URLs. Use async collections for batches of 5+ URLs
- Check `potentiallyBlockedByCaptcha` in the response to detect if a site blocked the request
- **Read `guidance` in every response** — it tells you: `error_type` (why it failed), `next_steps` (what to try), `suggested_request` (ready-to-use params), `stop_reason` (when to stop retrying)
- Use `timings` in the response to diagnose slow requests — always present, even on errors
- Handle HTTP 429 responses by reading the `Retry-After` header
- Check the `X-Quota-Remaining` header to monitor your monthly credit usage

## Credit System

1 simple request = 1 credit. 1 browser request = 5 credits. Anti-bot (Camoufox) and proxy rotation are included at no extra cost. Credits are NOT consumed for requests that fail due to infrastructure errors (timeouts, proxy failures). See `GET /v1/plans` for plan details.
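The options above all compose into a single JSON request body. Here is a minimal sketch using only the Python standard library; the endpoint, field names, and demo token come from this document, while the `build_payload` and `scrape` helper names are purely illustrative:

```python
import json
import urllib.request

API = "https://api.scrapingpros.com/v1/sync/scrape"
TOKEN = "demo_6x595maoA6GdOdVb"  # public demo token from this document

def build_payload(url, *, markdown=False, browser=False,
                  country=None, retry_on_block=False):
    """Assemble a /v1/sync/scrape body following the guidelines above."""
    payload = {"url": url}
    if markdown:
        payload["format"] = "markdown"    # clean text, recommended for LLMs
    if browser:
        payload["browser"] = True         # render JS-heavy pages (5 credits)
    if country:
        payload["use_proxy"] = country    # "any" or an ISO country code
    if retry_on_block:
        payload["retry_on_block"] = True  # up to 3 retries, new IP/fingerprint
    return payload

def scrape(url, **opts):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        API,
        data=json.dumps(build_payload(url, **opts)).encode(),
        headers={"Authorization": f"Bearer {TOKEN}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

For example, `scrape("https://example.com", markdown=True, retry_on_block=True)` sends a 1-credit request that returns clean markdown and auto-retries if the page is blocked.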
## Webhooks (Async Completion Notifications)

When creating an async collection, add `callback_url` to receive a POST notification when all jobs complete:

```json
POST /v1/async/collections
{"name": "my-batch", "requests": [...], "callback_url": "https://your-server.com/webhook"}
```

When the run completes, Scraping Pros sends a signed POST to your URL with:

- `event`: "run.completed"
- `run_id`, `collection_id`, `status`, `total_requests`, `success_requests`, `failed_requests`
- `job_ids`: array of job IDs to fetch individual results
- Security: HMAC-SHA256 signature in the `X-SP-Signature` header, timestamp in `X-SP-Timestamp`

Track delivery status via the `callback_status` field in the run response (pending → sent / failed / retrying).

## MCP Server (for AI agents)

Scraping Pros has a Model Context Protocol (MCP) server for direct integration with AI assistants (Claude, GPT, Cursor).

Endpoint: `https://api.scrapingpros.com/mcp` (Streamable HTTP transport)

Available tools: `scrape_url`, `scrape_as_markdown`, `discover`, `list_proxy_countries`, `check_billing`, `health_check`.

All scrape tools support `retry_on_block` for automatic retries on CAPTCHA/blocked pages with a different IP/fingerprint. The `discover` tool is unique: it analyzes a URL before scraping and returns actionable recommendations — specific blockers detected (CAPTCHA provider, Cloudflare, login wall), difficulty level, and ready-to-use scrape parameters. No other scraping API offers this. All scraped content from MCP tools is wrapped with anti-injection markers to prevent prompt injection from malicious web pages.

## API Reference

- [Scrape a URL](https://api.scrapingpros.com/docs#/scraping/scrape_v1_sync_scrape_post): POST /v1/sync/scrape — Core endpoint. Returns HTML, extracted data, screenshots, network requests.
- [Download a file](https://api.scrapingpros.com/docs#/scraping/download_v1_sync_download_post): POST /v1/sync/download — Download files (PDFs, images) via browser.
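Verifying the webhook signature on your server might look like the sketch below. The header names (`X-SP-Signature`, `X-SP-Timestamp`) come from this document, but the exact message that is signed is not specified here; the sketch assumes the common `"<timestamp>.<body>"` convention with a hex-encoded digest, so confirm the scheme against the official docs before relying on it.

```python
import hmac
import hashlib

def verify_webhook(secret: str, timestamp: str, body: bytes,
                   signature: str) -> bool:
    """Check an X-SP-Signature value against an HMAC-SHA256 of the payload.

    ASSUMPTION: the signed message is b"<timestamp>.<body>" and the
    signature is hex-encoded; verify this against the official docs.
    """
    message = timestamp.encode() + b"." + body
    expected = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected, signature)
```

In addition to the signature, compare `X-SP-Timestamp` against your server clock and reject stale deliveries to guard against replay attacks.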
- [Create collection](https://api.scrapingpros.com/docs#/async/create_collection_v1_async_collections_post): POST /v1/async/collections — Create a batch of URLs for async processing. Supports `callback_url` for webhook notifications.
- [Run collection](https://api.scrapingpros.com/docs#/async): POST /v1/async/collections/{id}/run — Execute a batch and poll for results.
- [Get run status](https://api.scrapingpros.com/docs#/async): GET /v1/async/collections/{id}/runs/{run_id} — Check batch completion, webhook delivery status.
- [Get job result](https://api.scrapingpros.com/docs#/async): GET /v1/async/collections/{id}/runs/{run_id}/jobs/{job_id}/result — Fetch individual job results (24h TTL).
- [List proxy countries](https://api.scrapingpros.com/docs#/proxy/list_countries_v1_proxy_countries_get): GET /v1/proxy/countries — Available countries for geo-targeted proxies.
- [Request country proxy](https://api.scrapingpros.com/docs#/proxy/request_country_v1_proxy_request_country_post): POST /v1/proxy/request-country — Request access to country-specific proxies.
- [Plans](https://api.scrapingpros.com/docs#/plans): GET /v1/plans — List all plans with pricing, credits, and features (no auth required).
- [Billing](https://api.scrapingpros.com/docs#/scraping/billing_v1_sync_billing_get): GET /v1/sync/billing — View your credit usage and costs.
- [Health check](https://api.scrapingpros.com/docs#/default/health_v1_health_get): GET /v1/health — API status (no auth required).
- [OpenAPI spec](https://api.scrapingpros.com/openapi.json): Full machine-readable API specification.

## Demo Access (no signup required)

Try the API immediately with this public demo token:

```
Authorization: Bearer demo_6x595maoA6GdOdVb
```

Demo limits: 5,000 credits/month, 30 requests/minute. All features enabled except country-specific proxies. For production use, contact the Scraping Pros team for a dedicated API key with higher limits.
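Because the demo token is capped at 30 requests/minute, clients should honor 429 responses and watch the quota header as the instructions above describe. A small stdlib-only sketch (the helper names are illustrative; the header names come from this document):

```python
def backoff_seconds(status: int, headers: dict, default: float = 5.0) -> float:
    """Seconds to wait before retrying; honors Retry-After on HTTP 429."""
    if status != 429:
        return 0.0  # not rate-limited, no wait needed
    try:
        return float(headers.get("Retry-After", default))
    except (TypeError, ValueError):
        return default  # unparsable header, fall back to a default wait

def quota_remaining(headers: dict):
    """Parse X-Quota-Remaining, or None when the header is absent."""
    value = headers.get("X-Quota-Remaining")
    return int(value) if value is not None else None
```

Note that `Retry-After` may also be an HTTP-date rather than a number of seconds; this sketch only handles the numeric form.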
## Authentication

All endpoints except /v1/health and /v1/plans require a Bearer token in the Authorization header:

```
Authorization: Bearer demo_6x595maoA6GdOdVb
```

Each token is tied to a client ID, a plan (with rate limits and a monthly credit quota), and optional feature flags.

## Quick Start Examples

- [Simple scrape](https://api.scrapingpros.com/llms-full.txt#simple-scrape): Retrieve HTML from any URL
- [Markdown output](https://api.scrapingpros.com/llms-full.txt#markdown-output): Get clean text instead of HTML (recommended for AI/LLM)
- [Browser scrape](https://api.scrapingpros.com/llms-full.txt#browser-scrape): Render JavaScript-heavy pages
- [Data extraction](https://api.scrapingpros.com/llms-full.txt#data-extraction): Extract structured data with CSS/XPath
- [Screenshot](https://api.scrapingpros.com/llms-full.txt#screenshot): Capture full-page screenshots
- [Proxy with country](https://api.scrapingpros.com/llms-full.txt#proxy-country): Access geo-restricted content
- [Browser actions](https://api.scrapingpros.com/llms-full.txt#browser-actions): Click, type, scroll, and evaluate JavaScript
- [Async batch with webhooks](https://api.scrapingpros.com/llms-full.txt#async-batch): Process hundreds of URLs efficiently with completion notifications
- [Retry on block](https://api.scrapingpros.com/llms-full.txt#retry-on-block): Auto-retry blocked requests with different IP/fingerprint

## Optional

- [Viability test](https://api.scrapingpros.com/docs#/viability): POST /v1/async/viability-test — Test if a site is scrapeable before committing.
- [Client metrics](https://api.scrapingpros.com/docs#/scraping/client_metrics_v1_sync_client_metrics_get): GET /v1/sync/client-metrics — Detailed usage analytics per domain.
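Tying the pieces together, the `potentiallyBlockedByCaptcha` flag and the `guidance` block described earlier can drive a simple retry loop. This sketch assumes `guidance` is a top-level object in the response body; the field names (`stop_reason`, `suggested_request`) come from this document, but the exact nesting may differ, so check a real response first:

```python
def next_action(result: dict):
    """Decide the follow-up for a scrape result using its `guidance` block.

    Returns ("stop", reason) when guidance says to stop retrying,
    ("retry", params) with the ready-to-use params from `suggested_request`
    when the request was blocked, or ("done", None) otherwise.
    """
    guidance = result.get("guidance") or {}
    if guidance.get("stop_reason"):
        return ("stop", guidance["stop_reason"])
    if result.get("potentiallyBlockedByCaptcha") and guidance.get("suggested_request"):
        return ("retry", guidance["suggested_request"])
    return ("done", None)
```

A caller would loop on `("retry", params)`, re-issuing the scrape with the suggested parameters merged in, and bail out as soon as `stop_reason` appears.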