$ man link-extract

/link-extract

PRICE / CALL   $0.005 · USDC · base mainnet · scheme: exact
METHOD         POST
CLUSTER        webprobe
CATEGORY       uncategorized
STATUS         live
NAME
link-extract - extracts every link from a webpage: fetches the HTML at the given URL and returns each <a> link with its anchor text, rel attribute, and an is_externa…
SYNOPSIS
POST https://x402.agentutility.ai/link-extract
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
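The two-step flow above (unpaid call, 402, signed retry) can be sketched as a small Python helper. This is a minimal sketch under stated assumptions, not the service's client: `send` and `sign` are hypothetical callables supplied by the caller, so the HTTP library and the signing of the transferWithAuthorization stay outside the example. The format of the 402 body and of the X-PAYMENT value is not described on this page, so the sketch passes both through untouched.

```python
from typing import Callable, Mapping, Tuple

# Hypothetical transport: takes (headers, json_body), returns (status, parsed_json).
Transport = Callable[[Mapping[str, str], dict], Tuple[int, dict]]
# Hypothetical signer: takes the 402 response body, returns the X-PAYMENT value.
Signer = Callable[[dict], str]


def _expect_ok(status: int, payload: dict) -> dict:
    """Return the payload on HTTP 200, otherwise raise."""
    if status != 200:
        raise RuntimeError(f"link-extract call failed with HTTP {status}")
    return payload


def call_with_payment(send: Transport, sign: Signer, body: dict) -> dict:
    """POST once; on 402, sign the returned requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = send(headers, body)
    if status != 402:
        return _expect_ok(status, payload)

    # Assumption: the 402 body carries everything the signer needs.
    paid_headers = {**headers, "X-PAYMENT": sign(payload)}
    status, payload = send(paid_headers, body)
    # A second 402 (or any other non-200) is treated as a failure, not retried.
    return _expect_ok(status, payload)
```

Injecting the transport keeps the retry logic independent of any particular HTTP or wallet library, and makes it easy to exercise with a stub before spending USDC.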
DESCRIPTION

Extracts every link from a webpage: fetches the HTML at the given URL and returns each <a> link with its anchor text, rel attribute, and an is_external flag. Relative URLs are resolved against the page's <base> or, failing that, the final URL after redirects. Lighter than the full scrape / metadata endpoints; intended for the agent task 'walk this page, pick which links to follow.' Returns up to 500 links by default (configurable via max_links). SSRF-guarded (no loopback / RFC1918 targets). Use it as a link extractor, page outlink crawler, or to scrape outbound links and get hrefs from a page.
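To illustrate the resolution rule described above, here is a minimal Python sketch using only the standard library. It is not the endpoint's implementation: `resolve_link` is a hypothetical helper, and it assumes "external" means the link's host differs from the host of the final URL, which matches the "same-host" wording of include_external_only but is not confirmed in detail by this page.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_link(href: str, final_url: str, base_href: Optional[str] = None) -> dict:
    """Resolve one href the way the description outlines and flag external hosts."""
    # A <base href> may itself be relative, so resolve it against final_url first.
    base_url = urljoin(final_url, base_href) if base_href else final_url
    absolute = urljoin(base_url, href)

    # Assumption: external = link host differs from the page's (final) host.
    page_host = urlsplit(final_url).hostname
    link_host = urlsplit(absolute).hostname
    return {"href": absolute, "is_external": link_host != page_host}


# Relative link on a page without a <base> tag.
print(resolve_link("../docs/a.html", "https://example.com/blog/post/"))
# Relative link on a page whose <base> points at another host.
print(resolve_link("page2", "https://example.com/a/b", "https://cdn.example.org/x/"))
```

Note that in this sketch hrefs without a host (for example mailto: links) compare as external; how the real endpoint treats them is not stated here.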

INPUT  request schema
property | type | description | req?
url | string | Page URL to fetch and extract from. http or https only. Private/loopback targets are rejected. Redirects followed. | required
include_external_only | boolean | If true, drop same-host links. Default false (return all). | optional
max_links | number | Max links to return (1-2000). Default 500. Pagination beyond max isn't supported; tighten the URL instead. | optional
include_text | boolean | If true (default), include the anchor's visible text. Set false to skip text extraction. | optional
timeout_ms | number | Fetch timeout. Default 12000, max 25000. | optional
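The constraints in the schema above can be checked client-side before paying for a call. The following is a minimal sketch, assuming the defaults and limits listed in the table; `build_request` and `_is_private_literal` are hypothetical helpers, the lower bound on timeout_ms is an assumption (the table only gives a maximum), and the private-address check only catches IP literals and "localhost", so it is no substitute for the server's own SSRF guard.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_private_literal(host: str) -> bool:
    """True for loopback/private IP literals and the name 'localhost'."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; only the obvious loopback name is caught here.
        return host.lower() == "localhost"
    return addr.is_private or addr.is_loopback


def build_request(url: str, include_external_only: bool = False,
                  max_links: int = 500, include_text: bool = True,
                  timeout_ms: int = 12000) -> dict:
    """Validate arguments against the documented schema and return the JSON body."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("url must be an absolute http or https URL")
    if _is_private_literal(parts.hostname):
        raise ValueError("private or loopback targets are rejected")
    if not 1 <= max_links <= 2000:
        raise ValueError("max_links must be between 1 and 2000")
    # Assumption: the timeout must be positive; the page only documents the max.
    if not 0 < timeout_ms <= 25000:
        raise ValueError("timeout_ms must be positive and at most 25000")
    return {
        "url": url,
        "include_external_only": include_external_only,
        "max_links": max_links,
        "include_text": include_text,
        "timeout_ms": timeout_ms,
    }
```

Rejecting bad input locally avoids paying $0.005 for a request the endpoint would refuse anyway.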
OUTPUT  response shape
field | type | description
url | string | Original URL submitted by the caller before any redirect resolution.
final_url | string | URL after following redirects; used as the fallback base for resolving relative hrefs.
base_url | string | Effective base URL used to resolve relative links, from the page's <base> tag or final_url.
page_title | string or null | Text inside the page's <title> tag, or null if the page has none.
links | array | Array of extracted anchors, each with href, anchor text, rel attribute, and is_external flag.
count | integer | Number of links returned in the links array after the max_links cap (default 500) is applied.
total_found | integer | Total anchor count discovered on the page before the max_links cap truncated the list.
truncated_at_max | boolean | True when total_found exceeded the max_links cap (default 500) and links was trimmed.
source | string | Identifier of the extractor pipeline that produced the result, e.g. link-extract worker name.
attribution | string | Required credit string for the link-extract endpoint when republishing the extracted data.
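As a sketch of how a caller might read the fields above, the Python helper below counts returned and external links and reports how many anchors the cap dropped. `summarize_links` is a hypothetical name, and the per-link key `is_external` is assumed from the description of the links array; the exact key names inside each link object are not spelled out on this page.

```python
def summarize_links(response: dict) -> dict:
    """Summarize a link-extract response: totals, external links, cap losses."""
    links = response.get("links", [])
    # Assumed key name for the per-link flag, taken from the field description.
    external = [link for link in links if link.get("is_external")]

    # Only attribute missing links to the cap when the response says it truncated;
    # other filters (e.g. include_external_only) may also shrink the list.
    dropped = 0
    if response.get("truncated_at_max"):
        dropped = max(response.get("total_found", len(links)) - len(links), 0)

    return {
        "returned": len(links),
        "external": len(external),
        "dropped_by_cap": dropped,
    }
```

A non-zero dropped_by_cap is a signal to raise max_links (up to 2000) or target a narrower page, since pagination is not supported.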
EXAMPLES  two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/link-extract \
  -H 'Content-Type: application/json' \
  -d '{ "url": "https://example.com" }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the link-extract tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
METADATA
tags      link · links · extract · anchor · href · crawler · scrape
methods   POST
cluster   webprobe
price     $0.005 USDC per call
ADJACENT  other endpoints in webprobe
endpoint | description | price
arxiv-bibtex | Turns an arXiv paper into a BibTeX entry: pulls title, authors, year, abstract, and DOI from the arXiv API and generates a properly-forma… | $0.005
brand-name-score | Scores a candidate brand or startup name on quality and risk. | $0.005
company-name-score | Scores the quality of a company name before you commit to domain, handle, trademark, and launch work. | $0.005
crypto-headlines | Searches recent bitcoin, ethereum, and DeFi headlines via GDELT with a GNews fallback when configured, returning headline URLs, domains,… | $0.005
crypto-news | Fetches recent cryptocurrency news headlines from GDELT with a GNews fallback when configured, filtered by crypto topic or caller query,… | $0.005
disposable-email-check | Detects disposable and throwaway email addresses before they get through your signup form. | $0.005
domain-availability | Checks whether a domain is registered and returns registrar, registration date, expiry date, days_until_expiry, and current EPP status flags. | $0.005
email | Validates an email address end to end: syntax, MX reachability, disposable/temp-mail domains, role accounts, and SPF/DMARC/DKIM posture. | $0.005
SEE ALSO
agentutility · webprobe · x402 · mcp · llms.txt · registry.json · bazaar.x402.org