$ man link-extract
/link-extract
PRICE / CALL
$0.005
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
webprobe
CATEGORY
uncategorized
STATUS
● live
NAME
link-extract — extracts every link from a webpage: fetches the html url and returns each <a> link with its anchor text, rel attribute, and an is_externa…
SYNOPSIS
POST https://x402.agentutility.ai/link-extract
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, retry with the X-PAYMENT header.
DESCRIPTION
Extracts every link from a webpage: fetches the HTML at the given URL and returns each <a> link with its anchor text, rel attribute, and an is_external flag. Resolves relative URLs against the page's <base> or, if the page has none, the final URL after redirects. Lighter than full scrape / metadata endpoints; the exact tool for the agent task 'walk this page, pick which links to follow.' Default 500-link cap, adjustable via max_links. SSRF-guarded (no loopback / RFC1918 targets). Use it as a link extractor, page outlink crawler, or to scrape outbound links and get hrefs from a page.
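The resolution and flagging behaviour described above can be approximated locally. The following is a minimal sketch, not the service's implementation: the function name `resolve_link` and its parameters are hypothetical, and it assumes that "external" means the link's host differs from the page's host (consistent with include_external_only dropping same-host links).

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_link(href: str, final_url: str, base_href: Optional[str] = None) -> dict:
    """Resolve one href the way the description suggests (illustrative only)."""
    # Effective base: the page's <base href> if present (itself resolved
    # against the final URL), otherwise the final URL after redirects.
    base_url = urljoin(final_url, base_href) if base_href else final_url
    absolute = urljoin(base_url, href)

    # Assumption: a link is external when its host differs from the page's host.
    page_host = (urlsplit(final_url).hostname or "").lower()
    link_host = (urlsplit(absolute).hostname or "").lower()
    return {"href": absolute, "is_external": link_host != page_host}
```

For instance, `resolve_link("../pricing", "https://example.com/docs/intro")` resolves to a same-host URL, while a page that declares a `<base>` on another host would have its relative links flagged as external under this assumption.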
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| url | string | Page URL to fetch and extract from. http or https only. Private/loopback targets are rejected. Redirects followed. | required |
| include_external_only | boolean | If true, drop same-host links. Default false (return all). | optional |
| max_links | number | Max links to return (1-2000). Default 500. Pagination beyond max isn't supported; tighten the URL instead. | optional |
| include_text | boolean | If true (default), include the anchor's visible text. Set false to skip text extraction. | optional |
| timeout_ms | number | Fetch timeout. Default 12000, max 25000. | optional |
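A request body that respects the documented ranges can be assembled before spending a paid call. This is a minimal sketch under stated assumptions: the helper `build_payload` is hypothetical, the lower bound on timeout_ms is assumed to be any positive value (the schema only states the default and the maximum), and the server remains the authority on validation.

```python
from urllib.parse import urlsplit


def build_payload(
    url: str,
    include_external_only: bool = False,
    max_links: int = 500,
    include_text: bool = True,
    timeout_ms: int = 12000,
) -> dict:
    """Build a link-extract request body, rejecting out-of-range values early."""
    # The schema allows http or https only.
    if urlsplit(url).scheme.lower() not in ("http", "https"):
        raise ValueError("url must use http or https")
    # Documented range for max_links is 1-2000.
    if not 1 <= max_links <= 2000:
        raise ValueError("max_links must be between 1 and 2000")
    # Documented maximum is 25000; the positive lower bound is an assumption.
    if not 0 < timeout_ms <= 25000:
        raise ValueError("timeout_ms must be positive and at most 25000")
    return {
        "url": url,
        "include_external_only": include_external_only,
        "max_links": max_links,
        "include_text": include_text,
        "timeout_ms": timeout_ms,
    }
```

Checking locally avoids paying for a call that the endpoint would reject, although private or loopback targets are still only detected server-side in this sketch.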
OUTPUT — response shape
| field | type | description |
|---|---|---|
| url | string | Original URL submitted by the caller before any redirect resolution. |
| final_url | string | URL after following redirects; used as the fallback base for resolving relative hrefs. |
| base_url | string | Effective base URL used to resolve relative links, from the page's <base> tag or final_url. |
| page_title | string or null | Text inside the page's <title> tag, or null if the page has none. |
| links | array | Array of extracted anchors, each with href, anchor text, rel attribute, and is_external flag. |
| count | integer | Number of links returned in the links array after the max_links cap (default 500) is applied. |
| total_found | integer | Total anchor count discovered on the page before the max_links cap truncated the list. |
| truncated_at_max | boolean | True when total_found exceeded the max_links cap (default 500) and links was trimmed. |
| source | string | Identifier of the extractor pipeline that produced the result, e.g. link-extract worker name. |
| attribution | string | Required credit string for the link-extract endpoint when republishing the extracted data. |
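A caller deciding which links to follow might consume the response along these lines. This is a sketch only: the helper names are hypothetical, and the per-link key names `href` and `is_external` are assumed from the field description of links above rather than confirmed by a published schema.

```python
def external_hrefs(result: dict) -> list:
    """Return the hrefs of links flagged as external (assumed key names)."""
    return [
        link["href"]
        for link in result.get("links", [])
        if link.get("is_external")
    ]


def dropped_link_count(result: dict) -> int:
    """How many anchors were found on the page but cut off by the cap."""
    if result.get("truncated_at_max"):
        return result["total_found"] - result["count"]
    return 0
```

A non-zero `dropped_link_count` is a hint to raise max_links (up to 2000) or target a narrower page, since pagination beyond the cap is not supported.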
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/link-extract \
-H 'Content-Type: application/json' \
-d '{ "url": "https://example.com" }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>
# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...
# Then call the link-extract tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
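For callers that do not use the MCP server, the 402-then-retry flow from Example 1 can be sketched as follows. This is an outline under assumptions, not a payment implementation: `post` and `sign_payment` are hypothetical callables supplied by the caller (an HTTP transport and a wallet signer), and the format of the payment requirements and of the signed header value is deliberately left opaque.

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: takes (json_payload, extra_headers) and
# returns (http_status, parsed_json_body).
Post = Callable[[dict, Dict[str, str]], Tuple[int, dict]]
# Hypothetical signer: takes the 402 body (payment requirements) and
# returns the string to send in the X-PAYMENT header.
Signer = Callable[[dict], str]


def call_with_payment(post: Post, sign_payment: Signer, payload: dict) -> Tuple[int, dict]:
    """Send the request; on 402, sign the requirements and retry once."""
    status, body = post(payload, {})
    if status != 402:
        # Already paid for, or failed for an unrelated reason.
        return status, body
    header_value = sign_payment(body)
    return post(payload, {"X-PAYMENT": header_value})
```

Keeping the transport and signer as injected callables means the retry logic can be exercised with fakes before any real key or funds are involved.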
METADATA
- tags
- link, links, extract, anchor, href, crawler, scrape
- methods
- POST
- cluster
- webprobe
- price
- $0.005 USDC per call
ADJACENT — other endpoints in webprobe
| endpoint | description | price |
|---|---|---|
| arxiv-bibtex | Turns an arXiv paper into a BibTeX entry: pulls title, authors, year, abstract, and DOI from the arXiv API and generates a properly-forma… | $0.005 |
| brand-name-score | Scores a candidate brand or startup name on quality and risk. | $0.005 |
| company-name-score | Scores the quality of a company name before you commit to domain, handle, trademark, and launch work. | $0.005 |
| crypto-headlines | Searches recent bitcoin, ethereum, and DeFi headlines via GDELT with a GNews fallback when configured, returning headline URLs, domains,… | $0.005 |
| crypto-news | Fetches recent cryptocurrency news headlines from GDELT with a GNews fallback when configured, filtered by crypto topic or caller query,… | $0.005 |
| disposable-email-check | Detects disposable and throwaway email addresses before they get through your signup form. | $0.005 |
| domain-availability | Checks whether a domain is registered and returns registrar, registration date, expiry date, days_until_expiry, and current EPP status flags. | $0.005 |
| | Validates an email address end to end: syntax, MX reachability, disposable/temp-mail domains, role accounts, and SPF/DMARC/DKIM posture. | $0.005 |
SEE ALSO