$ man pdf-to-markdown

/pdf-to-markdown

PRICE / CALL  $0.0025 (USDC · base mainnet · scheme: exact)
METHOD        POST
CLUSTER       mediakit
CATEGORY      utilities
STATUS        live
NAME
pdf-to-markdown - converts digital or scanned PDFs to clean Markdown with AI-powered, layout-aware extraction on the Datalab Marker engine
SYNOPSIS
POST https://x402.agentutility.ai/pdf-to-markdown
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
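The two-step flow above (unpaid call, 402, signed retry) can be sketched in Python as follows. This is a minimal illustration, not the service's client: `call_with_payment`, the `post` transport, and the `sign` callback are hypothetical names, the signing itself is left to whatever wallet library you use, and treating only HTTP 200 as success is an assumption.

```python
from typing import Callable, Tuple

# Hypothetical transport type: (headers, body) -> (HTTP status, decoded JSON payload).
Transport = Callable[[dict, dict], Tuple[int, dict]]


def call_with_payment(post: Transport, body: dict,
                      sign: Callable[[dict], str]) -> dict:
    """Send the request once; on 402, sign and retry with X-PAYMENT."""
    headers = {"Content-Type": "application/json"}
    status, payload = post(headers, body)

    if status == 402:
        # The 402 payload carries the payment requirements. `sign` is an
        # assumed callback that turns them into the X-PAYMENT header value.
        paid_headers = {**headers, "X-PAYMENT": sign(payload)}
        status, payload = post(paid_headers, body)

    # Assumption: anything other than 200 is treated as a failure here.
    if status != 200:
        raise RuntimeError(f"request failed with HTTP {status}")
    return payload
```

Passing the transport in as a function keeps the retry logic independent of any HTTP library, so it can be exercised with a stub before real funds are involved.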
DESCRIPTION

Converts digital or scanned PDFs to clean Markdown with AI-powered, layout-aware extraction on the Datalab Marker engine. Preserves headings, tables, equations (LaTeX), bulleted lists, and multi-column flow; outputs Markdown (default), HTML, or structured JSON with per-page blocks. 30 pages max. Use it as a PDF parser, PDF to text converter, OCR PDF reader, extract-tables-from-PDF tool, equation-aware PDF parser, scanned-PDF OCR, or PDF data extractor.

INPUT  request schema
pdf_url (string, required)
     Public URL of a PDF file (http or https). Must be directly fetchable, not behind auth or a viewer redirect. Max 30 pages.
output_format (string, optional)
     'markdown' (default; best for downstream LLM use), 'html' (preserves more layout structure), or 'json' (per-page blocks with type + bbox).
     enum: markdown · html · json
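As an illustration of the request schema, the sketch below builds a request body and rejects values the schema rules out before any paid call is made. The helper name `build_request` is hypothetical. It checks only the URL scheme and the enum, so it cannot tell whether the PDF is actually fetchable or within the 30-page limit.

```python
from urllib.parse import urlparse

# Enum values taken from the request schema.
ALLOWED_FORMATS = ("markdown", "html", "json")


def build_request(pdf_url: str, output_format: str = "markdown") -> dict:
    """Return a JSON-serializable body for POST /pdf-to-markdown."""
    parsed = urlparse(pdf_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("pdf_url must be a public http or https URL")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
    return {"pdf_url": pdf_url, "output_format": output_format}
```

Validating locally avoids paying for a call that the schema would reject anyway.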
OUTPUT  response shape
markdown (string)
     Markdown content (present when output_format is 'markdown')
html (string)
     HTML content (present when output_format is 'html')
json (object)
     Structured JSON content (present when output_format is 'json')
output_format (string)
     Echo of the format used
page_count (number)
     Number of pages in the source PDF
source_url (string)
     Echo of the input URL
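Because the content field is named after the format that was requested, a caller can use the echoed output_format to pick the right one. A minimal sketch, assuming the response has already been decoded into a Python dict; `extract_content` is a hypothetical helper name.

```python
def extract_content(response: dict):
    """Return the converted content from a decoded response.

    The field to read ('markdown', 'html', or 'json') is chosen from
    the echoed output_format, as described in the response shape.
    """
    fmt = response["output_format"]
    if fmt not in ("markdown", "html", "json"):
        raise ValueError(f"unexpected output_format: {fmt!r}")
    if fmt not in response:
        raise KeyError(f"response is missing the {fmt!r} field")
    # A str for 'markdown' and 'html'; a dict for 'json'.
    return response[fmt]
```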
EXAMPLES  two ways to call
EXAMPLE 1 · curl
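# The pdf_url below is a placeholder; substitute a public, directly fetchable PDF URL.
curl -X POST https://x402.agentutility.ai/pdf-to-markdown \
  -H 'Content-Type: application/json' \
  -d '{ "pdf_url": "https://example.com/document.pdf", "output_format": "markdown" }'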
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the pdf-to-markdown tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
METADATA
tags     pdf · markdown · html · json · conversion
env      DATALAB_API_KEY
methods  POST
cluster  mediakit
price    $0.0025 USDC per call
ADJACENT  other endpoints in mediakit
convert-pdf ($0.0025)
     Converts PDFs to Markdown, HTML, JSON, or structured text with the Datalab Marker AI pipeline, preserving headings, tables, equations, an…
ocr ($0.0025)
     Runs OCR on scanned PDFs and image-based documents, returning clean Markdown or plain text.
pdf-parser-api ($0.0025)
     Parses a public PDF URL into Markdown, HTML, or JSON blocks with layout-aware text, headings, tables, and equations.
pdf-text-extractor ($0.0025)
     Extracts clean Markdown, HTML, or structured JSON from digital or scanned PDFs while preserving reading order, tables, and equations.
pdf-to-markdown-api ($0.0025)
     Converts a public PDF URL into clean Markdown, HTML, or structured JSON while preserving headings, tables, equations, and reading order.
pdf-to-text ($0.0025)
     Extracts text from digital or scanned PDFs, preserving reading order across multi-column layouts with an AI + OCR pipeline (Datalab Marker).
pdf-to-text-api ($0.0025)
     Extracts text from digital and scanned PDFs as Markdown, plain text, HTML, or JSON with layout-aware reading order.
compress-pdf ($0.005)
     PDF compressor / PDF size reducer.
SEE ALSO
agentutility · mediakit · x402 · mcp · llms.txt · registry.json · bazaar.x402.org