Web Scraping
API Reference
Complete request and response schema for the Web Scraping API.
POST /scrape
Scrape a web page and return the fully rendered HTML content.
Endpoint
POST https://api.pagepry.com/v1/scrape
Headers
| Header | Required | Description |
|---|---|---|
| x-api-key | Yes | Your PagePry API key |
| Content-Type | Yes | Must be application/json |
Request body
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | required | The URL to scrape. Must be a valid URL. |
| waitFor | string | "auto" | Page readiness strategy. One of: "auto", "networkidle", "domcontentloaded". See Wait Strategies. |
| timeoutMs | number | 30000 | Maximum wait time in ms (1,000–120,000). |
| proxy | string | "none" | Proxy type: "none", "datacenter", "residential". See Proxies. |
| headers | object | — | Custom HTTP headers as key-value pairs. |
| cookies | array | — | Cookies to set before navigation. Each entry: {name, value, domain}. See Cookies & Headers. |
| cache | boolean | true | Use cached results if available. Cached responses cost 0 credits. |
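The parameter table above can be sketched as a small request builder. This is a minimal Python sketch, not an official client: the names `build_payload` and `build_request` are hypothetical, the validation rules are simply read off the table, and treating all three cookie fields as required is an assumption.

```python
import json
import urllib.request

API_URL = "https://api.pagepry.com/v1/scrape"
WAIT_STRATEGIES = {"auto", "networkidle", "domcontentloaded"}
PROXY_TYPES = {"none", "datacenter", "residential"}


def build_payload(url, wait_for="auto", timeout_ms=30000, proxy="none",
                  headers=None, cookies=None, cache=True):
    """Check arguments against the documented ranges and return the JSON body as a dict."""
    if not url:
        raise ValueError("url is required")
    if wait_for not in WAIT_STRATEGIES:
        raise ValueError(f"waitFor must be one of {sorted(WAIT_STRATEGIES)}")
    if not 1000 <= timeout_ms <= 120000:
        raise ValueError("timeoutMs must be between 1000 and 120000")
    if proxy not in PROXY_TYPES:
        raise ValueError(f"proxy must be one of {sorted(PROXY_TYPES)}")

    payload = {
        "url": url,
        "waitFor": wait_for,
        "timeoutMs": timeout_ms,
        "proxy": proxy,
        "cache": cache,
    }
    if headers:
        payload["headers"] = dict(headers)
    if cookies:
        for cookie in cookies:
            # Assumption: name, value, and domain are all required per entry.
            missing = {"name", "value", "domain"} - set(cookie)
            if missing:
                raise ValueError(f"cookie is missing fields: {sorted(missing)}")
        payload["cookies"] = list(cookies)
    return payload


def build_request(api_key, payload):
    """Wrap the payload in a POST request carrying the two required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

The resulting request can be sent with `urllib.request.urlopen(request)`. Validating on the client side is a design choice to catch out-of-range values before a call is made; the API may apply additional rules of its own.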
Example request
curl -X POST https://api.pagepry.com/v1/scrape \
  -H "x-api-key: pp_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "waitFor": "auto",
    "proxy": "datacenter",
    "cache": true
  }'
Success response (200)
{
  "success": true,
  "html": "<!DOCTYPE html><html>...</html>",
  "metadata": {
    "statusCode": 200,
    "url": "https://example.com",
    "resolvedUrl": "https://example.com/",
    "contentType": "text/html; charset=utf-8",
    "renderStrategy": "ssr-early-return",
    "loadTimeMs": 847,
    "fromCache": false
  }
}

| Field | Type | Description |
|---|---|---|
| success | boolean | Always true for successful responses. |
| html | string | The fully rendered HTML content of the page. |
| metadata.statusCode | number | HTTP status code of the target page. |
| metadata.url | string | The original requested URL. |
| metadata.resolvedUrl | string | The final URL after any redirects. |
| metadata.contentType | string | Content-Type header of the response. |
| metadata.renderStrategy | string | The readiness strategy used (e.g., ssr-early-return, network-idle). |
| metadata.loadTimeMs | number | Time in milliseconds to load and render the page. |
| metadata.fromCache | boolean | Whether this result was served from cache. |
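The response fields above map naturally onto typed records. The following Python sketch shows one way to parse a success body; the class names `ScrapeMetadata` and `ScrapeResult` and the function `parse_success` are hypothetical, and the snake_case attribute names are a local convention rather than part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeMetadata:
    status_code: int      # HTTP status of the target page, not of the API call
    url: str              # URL as originally requested
    resolved_url: str     # final URL after redirects
    content_type: str
    render_strategy: str
    load_time_ms: int     # documented as "number"; the example shows an integer
    from_cache: bool


@dataclass(frozen=True)
class ScrapeResult:
    html: str
    metadata: ScrapeMetadata


def parse_success(body: str) -> ScrapeResult:
    """Parse a 200 response body into typed records; reject non-success bodies."""
    data = json.loads(body)
    if data.get("success") is not True:
        raise ValueError("not a success response")
    meta = data["metadata"]
    return ScrapeResult(
        html=data["html"],
        metadata=ScrapeMetadata(
            status_code=meta["statusCode"],
            url=meta["url"],
            resolved_url=meta["resolvedUrl"],
            content_type=meta["contentType"],
            render_strategy=meta["renderStrategy"],
            load_time_ms=meta["loadTimeMs"],
            from_cache=meta["fromCache"],
        ),
    )
```

Keeping `status_code` separate from the API's own HTTP status is useful because a call can succeed with 200 while the target page itself returned, say, a 404.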
Error response
{
  "success": false,
  "error": "Human-readable error message"
}
See Errors for the full list of error codes and HTTP status codes.
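Because both response shapes carry a `success` flag, a caller can branch on it before touching any other field. The sketch below is one possible approach in Python; `ScrapeError` and `unwrap_response` are hypothetical names, and the fallback message for a missing `error` field is an assumption.

```python
import json


class ScrapeError(Exception):
    """Raised when the API body reports success: false."""

    def __init__(self, message, http_status=None):
        super().__init__(message)
        self.http_status = http_status


def unwrap_response(body, http_status=None):
    """Return the parsed body on success, otherwise raise ScrapeError."""
    data = json.loads(body)
    if data.get("success") is True:
        return data
    # Assumption: fall back to a generic message if "error" is absent.
    raise ScrapeError(data.get("error", "Unknown error"), http_status)
```

When the call is made with `urllib.request.urlopen`, a non-2xx status surfaces as `urllib.error.HTTPError`; its body can be read with `.read()` and passed to `unwrap_response` together with its `.code`.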

