Skip to content
clusters: prooflayer · edgemarket · edgefinance · synthforge · mediakit · wordmint · webprobe · locale · comppoint · rollforge · bestiary · statline · matchpoint · retail · agentops · browserworkflow · modelrouter · compose
$ man scrape

/scrape

PRICE / CALL
$0.04
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
webprobe
CATEGORY
utilities
STATUS
live
NAME
scrape scrape any webpage
SYNOPSIS
POST https://x402.agentutility.ai/scrape
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign USDCtransferWithAuthorization, retry with theX-PAYMENT header.
DESCRIPTION

Scrape any webpage. Pulls title, description, canonical URL, OpenGraph + Twitter card metadata, headings, and outbound links from a single URL. Server-side rendering. Body content rendered as text / raw HTML / clean markdown. Optional link extraction. Cheerio-based, no headless browser — fast and cheap, ideal for static pages and SSR sites. Alias of scrape-website. For JS-heavy SPAs that need a real browser, see website-screenshot.

INPUTrequest schema
propertytypedescriptionreq?
urlstringPublic URL to fetch and parse. Must include scheme (http/https). Follows redirects.required
formatstringBody output format. 'text' (default), 'html' (raw), or 'markdown' (clean — best for LLM ingestion).
enum: text · html · markdown
optional
include_linksbooleanIf true, also returns an array of all <a href> links on the page. Default false.optional
user_agentstringCustom User-Agent header. Defaults to a modern desktop Chrome UA.optional
OUTPUTresponse shape
fieldtypedescription
urlstringFinal URL fetched after following redirects.
final_urlstringFinal URL after redirects.
status_codenumberHTTP status code returned by the upstream server.
titlestringPage title from the <title> tag.
descriptionstringMeta description from the page's <meta name="description"> or OG description tag.
canonicalstringCanonical URL from <link rel="canonical"> when present.
langstringLanguage code from the <html lang> attribute, like en or fr.
h1stringText of the first <h1> heading on the page.
ogobjectAll og:* meta tags.
twitterobjectAll twitter:* meta tags.
textstringBody when format=text.
htmlstringBody when format=html.
markdownstringBody when format=markdown.
formatstringOutput format of the body content: text, html, or markdown.
linksarrayArray of {href, text} when include_links=true.
body_charsnumberCharacter count of the extracted body content.
EXAMPLEStwo ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/scrape \
  -H 'Content-Type: application/json' \
  -d '{ }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the scrape tool from your MCP-aware agent.
MCP server handles payment automatically — your coding agent just calls the tool by name.
METADATA
tags
scrapefetchhtmlextractmetadataurl
methods
POST
cluster
webprobe
price
$0.04 USDC per call
ADJACENTother endpoints in webprobe
endpointdescriptionprice
answer-webAnswers natural-language questions with live web research, returning a synthesized answer with inline [N] citations and the source URLs.$0.04
arxiv-summarizearXiv paper summarizer / research-paper TLDR.$0.04
rss-from-anythingRSS feed generator / HTML to RSS converter.$0.04
scrape-websiteScrapes any webpage and pulls title, description, canonical URL, OpenGraph + Twitter card metadata, headings, and outbound links from a s…$0.04
screenshotWebsite screenshot / URL to PNG/JPG.$0.04
webpage-diffDetects changes on a webpage: fetches a URL, strips HTML to plain text, computes a SHA-256 hash, and (when given a previous hash or text)…$0.04
website-screenshotCaptures a website screenshot from a URL as PNG or JPG.$0.04
arxiv-searcharXiv full-text search.$0.03
SEE ALSO
agentutility · webprobe · x402 · mcp · llms.txt · registry.json · bazaar.x402.org