Name: link-extract
Price: 0.005 USDC
Availability: InStock

$ man link-extract

agentutility / web-probe / link-extract

PRICE / CALL

$0.005

USDC · base mainnet · scheme: exact

METHOD

POST

CLUSTER

webprobe

CATEGORY

uncategorized

STATUS

● live

NAME

link-extract — extracts every link from a webpage: fetches the html url and returns each <a> link with its anchor text, rel attribute, and an is_externa…

SYNOPSIS

POST https://x402.agentutility.ai/link-extract
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }

↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
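The 402-then-retry flow can be sketched in Python as below. This is a minimal illustration, not the service's client library: the `post` transport and the `sign` callable are hypothetical stand-ins that you would supply (for example an HTTP client wrapper and a wallet or x402 signer), and the shape of the 402 response body passed to `sign` is an assumption.

```python
import json
from typing import Callable, Tuple

# Hypothetical transport: (url, headers, body_bytes) -> (status_code, parsed_json)
Post = Callable[[str, dict, bytes], Tuple[int, dict]]

ENDPOINT = "https://x402.agentutility.ai/link-extract"


def call_with_payment(post: Post, sign: Callable[[dict], str], body: dict) -> dict:
    """Send the request; on 402, sign the payment requirements and retry once."""
    payload = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}

    status, data = post(ENDPOINT, headers, payload)
    if status == 200:
        return data  # no payment was demanded
    if status != 402:
        raise RuntimeError(f"unexpected response: HTTP {status}")

    # `sign` stands in for whatever produces the signed
    # transferWithAuthorization; its input and output formats are assumed.
    paid_headers = {**headers, "X-PAYMENT": sign(data)}
    status, data = post(ENDPOINT, paid_headers, payload)
    if status != 200:
        raise RuntimeError(f"request failed after payment: HTTP {status}")
    return data
```

Passing the transport and signer in as arguments keeps the retry logic separate from any particular HTTP or wallet library.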

DESCRIPTION

Extracts every link from a webpage: fetches the HTML at the given URL and returns each <a> link with its anchor text, rel attribute, and an is_external flag. Relative URLs are resolved against the page's <base> or, if there is none, the final URL. Lighter than the full scrape / metadata endpoints; the exact tool for the agent task 'walk this page, pick which links to follow.' Default 500-link cap. SSRF-guarded (no loopback / RFC1918 targets). Use it as a link extractor or page outlink crawler, or to scrape outbound links and get hrefs from a page.
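To make the resolution rule concrete, here is a small local sketch of how a relative href could be resolved against a <base> or the final URL using Python's standard `urllib.parse`. The helper name `resolve_link` is hypothetical, and treating "external" as "different host than the fetched page" is an assumption inferred from the include_external_only description, not a confirmed rule of the service.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_link(href: str, final_url: str, base_href: Optional[str] = None) -> dict:
    """Resolve one href the way a browser would and flag off-host targets."""
    # A <base href> may itself be relative, so resolve it against final_url first.
    base_url = urljoin(final_url, base_href) if base_href else final_url
    absolute = urljoin(base_url, href)
    # Assumption: a link is external when its host differs from the page's host.
    is_external = urlsplit(absolute).hostname != urlsplit(final_url).hostname
    return {"href": absolute, "is_external": is_external}
```

For instance, with the page at https://example.com/blog/post, the href /about resolves to https://example.com/about and stays on the same host, while https://other.org/x is flagged as external.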

INPUT — request schema

property	type	description	req?
url	string	Page URL to fetch and extract from. http or https only. Private/loopback targets are rejected. Redirects followed.	required
include_external_only	boolean	If true, drop same-host links. Default false (return all).	optional
max_links	number	Max links to return (1-2000). Default 500. Pagination beyond max isn't supported; tighten the URL instead.	optional
include_text	boolean	If true (default), include the anchor's visible text. Set false to skip text extraction.	optional
timeout_ms	number	Fetch timeout. Default 12000, max 25000.	optional
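The limits in the table above can be checked client-side before spending a paid call. The sketch below is one possible pre-flight validator; the function name `build_request` is hypothetical, the private-address check only covers literal IPs and `localhost` (the service's own SSRF guard is authoritative), and requiring a positive timeout is an assumption because no lower bound is documented.

```python
import ipaddress
from urllib.parse import urlsplit


def build_request(url, include_external_only=False, max_links=500,
                  include_text=True, timeout_ms=12000) -> dict:
    """Validate arguments against the documented limits and build the JSON body."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("url must be an absolute http or https URL")

    # Cheap local pre-check for obviously private targets.
    if parts.hostname == "localhost":
        raise ValueError("loopback targets are rejected")
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        ip = None  # a DNS name; left for the service to vet
    if ip is not None and (ip.is_loopback or ip.is_private):
        raise ValueError("private/loopback targets are rejected")

    if not 1 <= max_links <= 2000:
        raise ValueError("max_links must be between 1 and 2000")
    # Upper bound is documented; positivity is an assumption.
    if not 0 < timeout_ms <= 25000:
        raise ValueError("timeout_ms must be positive and at most 25000")

    return {
        "url": url,
        "include_external_only": include_external_only,
        "max_links": max_links,
        "include_text": include_text,
        "timeout_ms": timeout_ms,
    }
```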

OUTPUT — response shape

field	type	description
url	string	Original URL submitted by the caller before any redirect resolution.
final_url	string	URL after following redirects; used as the fallback base for resolving relative hrefs.
base_url	string	Effective base URL used to resolve relative links, from the page's <base> tag or final_url.
page_title	string | null	Text inside the page's <title> tag, or null if the page has none.
links	array	Array of extracted anchors, each with href, anchor text, rel attribute, and is_external flag.
count	integer	Number of links returned in the links array after the max_links cap (default 500) is applied.
total_found	integer	Total anchor count discovered on the page before the max-link cap truncated the list.
truncated_at_max	boolean	True when total_found exceeded the max_links cap (default 500) and links was trimmed.
source	string	Identifier of the extractor pipeline that produced the result, e.g. link-extract worker name.
attribution	string	Required credit string for the link-extract endpoint when republishing the extracted data.
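A consumer of this response might pick out external links and check whether the list was cut short. The sketch below works on a plain dict with the top-level fields from the table; the per-link key names `href` and `is_external` are assumptions based on the links description, and both helper names are hypothetical.

```python
def external_hrefs(response: dict) -> list:
    """Return the hrefs of links flagged as external."""
    links = response["links"]
    # Sanity check: count should match the number of links actually returned.
    if response["count"] != len(links):
        raise ValueError("count does not match the length of links")
    # Per-link keys `href` and `is_external` are assumed names.
    return [link["href"] for link in links if link["is_external"]]


def dropped_links(response: dict) -> int:
    """Number of links found on the page but omitted because of the cap."""
    if not response["truncated_at_max"]:
        return 0
    return response["total_found"] - response["count"]
```

If `dropped_links` returns a non-zero value, raising max_links (up to 2000) or fetching a narrower URL are the documented ways to see more, since pagination is not supported.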

EXAMPLES — two ways to call

EXAMPLE 1 · curl

curl -X POST https://x402.agentutility.ai/link-extract \
  -H 'Content-Type: application/json' \
  -d '{ "url": "https://example.com" }'

The first response is 402 Payment Required with the payment requirements; sign them and retry with the X-PAYMENT header.

EXAMPLE 2 · mcp

# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the link-extract tool from your MCP-aware agent.

MCP server handles payment automatically — your coding agent just calls the tool by name.

METADATA

tags: link, links, extract, anchor, href, crawler, scrape
methods: POST
cluster: webprobe
price: $0.005 USDC per call

ADJACENT — other endpoints in webprobe

endpoint	description	price
arxiv-bibtex	Turns an arXiv paper into a BibTeX entry: pulls title, authors, year, abstract, and DOI from the arXiv API and generates a properly-forma…	$0.005
brand-name-score	Scores a candidate brand or startup name on quality and risk.	$0.005
company-name-score	Scores the quality of a company name before you commit to domain, handle, trademark, and launch work.	$0.005
crypto-headlines	Searches recent bitcoin, ethereum, and DeFi headlines via GDELT with a GNews fallback when configured, returning headline URLs, domains,…	$0.005
crypto-news	Fetches recent cryptocurrency news headlines from GDELT with a GNews fallback when configured, filtered by crypto topic or caller query,…	$0.005
disposable-email-check	Detects disposable and throwaway email addresses before they get through your signup form.	$0.005
domain-availability	Checks whether a domain is registered and returns registrar, registration date, expiry date, days_until_expiry, and current EPP status flags.	$0.005
email	Validates an email address end to end: syntax, MX reachability, disposable/temp-mail domains, role accounts, and SPF/DMARC/DKIM posture.	$0.005