$ man sitemap-fetch
/sitemap-fetch
PRICE / CALL
$0.005
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
webprobe
CATEGORY
uncategorized
STATUS
● live
NAME
sitemap-fetch — fetches and parses a site's sitemap.xml into a full website url inventory
SYNOPSIS
POST https://x402.agentutility.ai/sitemap-fetch
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
DESCRIPTION
Fetches and parses a site's sitemap.xml into a full website URL inventory. Accepts either a site root (the sitemap is then discovered via robots.txt or the /sitemap.xml convention) or a direct sitemap.xml URL, and recurses through sitemap-index nesting. Returns the URL list with lastmod, changefreq, and priority, plus aggregate stats (count, oldest and newest lastmod). Useful for SEO audits, content-freshness monitoring, and RAG ingestion seeding. Use it as a sitemap parser, sitemap index resolver, SEO sitemap reader, or robots.txt sitemap discovery tool.
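The discovery and recursion steps described above can be illustrated with a small offline sketch. This is not the service's implementation; it only shows the general technique under stated assumptions. The function names (`sitemaps_from_robots`, `parse_sitemap`, `collect_urls`) and the caller-supplied `fetch` function are hypothetical, and the XML namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace, in ElementTree's "{uri}tag" notation.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt):
    """Collect the targets of 'Sitemap:' lines (matched case-insensitively)."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


def parse_sitemap(xml_text):
    """Return ('index' | 'urlset', entries); each entry maps child tag -> text."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == SITEMAP_NS + "sitemapindex" else "urlset"
    entries = []
    for node in root:
        entry = {}
        for child in node:
            entry[child.tag.replace(SITEMAP_NS, "")] = (child.text or "").strip()
        entries.append(entry)
    return kind, entries


def collect_urls(start, fetch, recurse=True, limit=1000):
    """Walk sitemap-index nesting breadth-first.

    `fetch` is a hypothetical caller-supplied function: sitemap URL -> XML text.
    """
    urls, queue, seen = [], [start], set()
    while queue and len(urls) < limit:
        current = queue.pop(0)
        if current in seen:
            continue  # guard against sitemap indexes that reference each other
        seen.add(current)
        kind, entries = parse_sitemap(fetch(current))
        if kind == "index":
            if recurse:
                queue.extend(e["loc"] for e in entries if "loc" in e)
        else:
            urls.extend(entries[: limit - len(urls)])
    return urls
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP client and lets it run against in-memory fixtures.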
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| url | string | Site root (e.g. https://example.com) or direct sitemap URL. | required |
| limit | number | Max URL rows. 1-5000. Default 1000. | optional |
| recurse | boolean | Recurse into sitemap-index children. Default true. | optional |
| user_agent | string | Optional User-Agent header. | optional |
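A request body matching the schema above could be assembled as in the following minimal sketch. The helper name `build_request` is hypothetical; the field names, defaults, and the 1-5000 range for `limit` come from the table, and the client-side validation is an assumption about what is convenient rather than a statement about how the endpoint responds to bad input.

```python
import json
from typing import Optional


def build_request(url: str, limit: int = 1000, recurse: bool = True,
                  user_agent: Optional[str] = None) -> str:
    """Serialize a sitemap-fetch request body, checking the documented limits."""
    if not url:
        raise ValueError("url is required")
    if not 1 <= limit <= 5000:
        raise ValueError("limit must be between 1 and 5000")
    body = {"url": url, "limit": limit, "recurse": recurse}
    if user_agent is not None:
        body["user_agent"] = user_agent  # optional, so omitted when not set
    return json.dumps(body)
```

For example, `build_request("https://example.com", limit=50)` yields a JSON string with `url`, `limit`, and `recurse` and no `user_agent` key.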
OUTPUT — response shape
| field | type | description |
|---|---|---|
| input_url | string | Site root or sitemap URL the caller passed in, echoed back for reference. |
| sitemaps_fetched | array | List of sitemap URLs actually retrieved, including nested sitemap-index children that were resolved. |
| url_count | number | Total number of URL entries found across all fetched sitemaps after recursion. |
| urls | array | Array of URL entries with loc, lastmod, changefreq, and priority fields parsed from the sitemap XML. |
| lastmod_oldest | string | Oldest lastmod timestamp seen across all URL entries, useful for content-freshness checks. |
| lastmod_newest | string | Newest lastmod timestamp seen across all URL entries, indicating most recent site update. |
| truncated | boolean | Boolean flag set when the URL list hit an internal cap and not every entry was returned. |
| bytes_total | number | Total bytes downloaded across every sitemap and sitemap-index file fetched during the call. |
| errors | array | Array of per-sitemap fetch or parse errors encountered, empty when all sitemaps loaded cleanly. |
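As a rough illustration of the content-freshness use case, the sketch below filters a decoded response for entries older than a cutoff date. The helper names `stale_urls` and `is_complete` are hypothetical. It assumes that `urls` arrives as a JSON array of objects, that `truncated` is a JSON boolean, that `errors` is a JSON array, and that `lastmod` values begin with a `YYYY-MM-DD` date; entries that do not fit are skipped rather than guessed at.

```python
from datetime import date


def stale_urls(response: dict, cutoff: date) -> list:
    """Return the loc of every entry whose lastmod falls before `cutoff`."""
    stale = []
    for entry in response.get("urls", []):
        lastmod = entry.get("lastmod")
        if not lastmod:
            continue  # lastmod is optional in sitemaps
        try:
            modified = date.fromisoformat(lastmod[:10])  # date part only
        except ValueError:
            continue  # unexpected format: skip instead of guessing
        if modified < cutoff:
            stale.append(entry["loc"])
    return stale


def is_complete(response: dict) -> bool:
    """True when the inventory was neither truncated nor hit per-sitemap errors."""
    return not response.get("truncated") and not response.get("errors")
```

Checking `is_complete` before trusting `stale_urls` is a sensible habit, since a truncated or partially failed crawl can hide stale pages.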
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/sitemap-fetch \
-H 'Content-Type: application/json' \
-d '{"url": "https://example.com"}'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>
# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...
# Then call the sitemap-fetch tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
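For callers not using the MCP server, the 402-then-retry handshake from the SYNOPSIS and EXAMPLE 1 might be structured as in this sketch. The names `call_with_payment`, `post`, and `sign_payment` are hypothetical: `post` stands in for whatever HTTP client sends the request, and `sign_payment` stands in for wallet code that turns the payment requirements into an `X-PAYMENT` header value. The signing itself is deliberately left out, since its details are not given here.

```python
from typing import Callable, Dict, Tuple

# (HTTP status code, decoded JSON body)
Response = Tuple[int, dict]


def call_with_payment(
    post: Callable[[dict, Dict[str, str]], Response],
    sign_payment: Callable[[dict], str],
    body: dict,
) -> dict:
    """POST once; on 402, sign the returned requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = post(body, headers)
    if status == 402:
        # The 402 body carries the payment requirements; the signer
        # (hypothetical, caller-supplied) produces the header value.
        headers = {**headers, "X-PAYMENT": sign_payment(payload)}
        status, payload = post(body, headers)
    if status != 200:
        raise RuntimeError(f"sitemap-fetch failed with HTTP {status}")
    return payload
```

Retrying only once keeps the sketch from looping on a payment that is repeatedly rejected.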
METADATA
- tags
- web-probe, sitemap, seo, crawler, robots-txt, url-inventory, sitemap-xml
- methods
- POST
- cluster
- webprobe
- price
- $0.005 USDC per call
ADJACENT — other endpoints in webprobe
| endpoint | description | price |
|---|---|---|
| arxiv-bibtex | Turns an arXiv paper into a BibTeX entry: pulls title, authors, year, abstract, and DOI from the arXiv API and generates a properly-forma… | $0.005 |
| brand-name-score | Scores a candidate brand or startup name on quality and risk. | $0.005 |
| company-name-score | Scores the quality of a company name before you commit to domain, handle, trademark, and launch work. | $0.005 |
| crypto-headlines | Searches recent bitcoin, ethereum, and DeFi headlines via GDELT with a GNews fallback when configured, returning headline URLs, domains,… | $0.005 |
| crypto-news | Fetches recent cryptocurrency news headlines from GDELT with a GNews fallback when configured, filtered by crypto topic or caller query,… | $0.005 |
| disposable-email-check | Detects disposable and throwaway email addresses before they get through your signup form. | $0.005 |
| domain-availability | Checks whether a domain is registered and returns registrar, registration date, expiry date, days_until_expiry, and current EPP status flags. | $0.005 |
| | Validates an email address end to end: syntax, MX reachability, disposable/temp-mail domains, role accounts, and SPF/DMARC/DKIM posture. | $0.005 |
SEE ALSO