$ man pdf-text-extractor

/pdf-text-extractor

PRICE / CALL   $0.20 (USDC · base mainnet · scheme: exact)
METHOD         POST
CLUSTER        mediakit
CATEGORY       uncategorized
STATUS         live
NAME
pdf-text-extractor - PDF text extractor / PDF to text API / OCR PDF reader
Synonym alias of pdf-to-markdown; reuses the canonical handler.
SYNOPSIS
POST https://x402.agentutility.ai/pdf-text-extractor
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
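
The pay-then-retry flow described above can be sketched in Python. This is a minimal sketch under stated assumptions, not an official client: `post` and `sign_payment` are hypothetical callables supplied by the caller (an HTTP transport and an x402 signer), and the sketch assumes the signer can derive the X-PAYMENT header value from the body of the 402 response.

```python
import json
from typing import Callable, Dict, Tuple

ENDPOINT = "https://x402.agentutility.ai/pdf-text-extractor"

# (status_code, response_body) as returned by the transport.
Response = Tuple[int, str]
# Hypothetical transport: post(url, headers, body) -> (status, body).
Post = Callable[[str, Dict[str, str], str], Response]
# Hypothetical signer: takes the 402 response body, returns the X-PAYMENT value.
Signer = Callable[[str], str]


def call_with_payment(post: Post, sign_payment: Signer, payload: dict) -> Response:
    """POST once; on 402, sign the payment requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    body = json.dumps(payload)

    status, text = post(ENDPOINT, headers, body)
    if status != 402:
        # Either no payment was needed or the call failed for another reason.
        return status, text

    # Assumption: the signer builds a signed USDC transferWithAuthorization
    # from the payment requirements carried in the 402 body.
    paid_headers = {**headers, "X-PAYMENT": sign_payment(text)}
    return post(ENDPOINT, paid_headers, body)
```

Keeping the transport and signer injectable means the flow can be exercised with stubs before any real USDC is spent.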
DESCRIPTION

PDF text extractor / PDF to text API / OCR PDF reader. Extracts clean Markdown, HTML, or structured JSON from digital or scanned PDFs while preserving reading order, tables, and equations. Datalab Marker backend, 30 pages max.

INPUT  request schema
     property       type     req?
     pdf_url        string   required
          Public URL of a PDF file (http or https). Must be directly fetchable, not behind auth or a viewer redirect. Max 30 pages.
     output_format  string   optional
          'markdown' (default; best for LLM downstream), 'html' (preserves more layout structure), or 'json' (per-page blocks with type + bbox).
          enum: markdown · html · json
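
A request body matching the schema above could be built and checked client-side before paying for a call. The helper below is a hypothetical sketch (`build_request` is not part of the service); it only checks the URL scheme and the enum, and cannot verify that the PDF is fetchable or within the 30-page limit.

```python
from urllib.parse import urlparse

# Allowed values for output_format, per the request schema.
OUTPUT_FORMATS = ("markdown", "html", "json")


def build_request(pdf_url: str, output_format: str = "markdown") -> dict:
    """Return a request body, raising ValueError on obviously bad input."""
    parsed = urlparse(pdf_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("pdf_url must be an absolute http or https URL")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return {"pdf_url": pdf_url, "output_format": output_format}
```

For example, `build_request("https://example.com/report.pdf", "json")` returns a dict ready to serialize as the POST body; the example.com URL is a placeholder.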
OUTPUT  response shape
     field        type     description
     markdown     string
     page_count   string
     source_url   string
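
A response with the fields listed above could be parsed as follows. This is a sketch that assumes a successful call returns a JSON object carrying exactly these three documented fields; `ExtractResult` and `parse_result` are hypothetical names, and the shape for the 'html' and 'json' output formats may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class ExtractResult:
    markdown: str
    page_count: str  # typed as a string in the documented response shape
    source_url: str


def parse_result(raw: str) -> ExtractResult:
    """Parse a response body, failing loudly if a documented field is absent."""
    data = json.loads(raw)
    missing = [k for k in ("markdown", "page_count", "source_url") if k not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return ExtractResult(
        markdown=str(data["markdown"]),
        page_count=str(data["page_count"]),
        source_url=str(data["source_url"]),
    )
```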
EXAMPLES  two ways to call
EXAMPLE 1 · curl
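# pdf_url below is a placeholder; substitute a publicly fetchable PDF
curl -X POST https://x402.agentutility.ai/pdf-text-extractor \
  -H 'Content-Type: application/json' \
  -d '{ "pdf_url": "https://example.com/sample.pdf", "output_format": "markdown" }'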
The first response is 402 Payment Required with the payment requirements; sign them and retry with the X-PAYMENT header.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the pdf-text-extractor tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
METADATA
     tags      mediakit · pdf · text · extractor · pdf-text-extractor
     methods   POST
     cluster   mediakit
     price     $0.20 USDC per call
ADJACENT  other endpoints in mediakit
     endpoint              price   description
     convert-pdf           $0.20   PDF converter / convert PDF to Markdown / HTML / JSON / structured text.
     ocr                   $0.20   OCR / optical character recognition / scanned document extractor / image-PDF to text.
     pdf-parser-api        $0.20   PDF parser API / PDF content extractor / scanned PDF OCR API.
     pdf-to-markdown       $0.20   PDF parser / PDF to Markdown / PDF to text / OCR PDF / extract tables from PDF / scanned-PDF OCR / PDF reader / PDF data extractor / equa…
     pdf-to-markdown-api   $0.20   PDF to Markdown API / PDF parser API / PDF to text API / scanned PDF OCR API.
     pdf-to-text           $0.20   Extract text from PDF / PDF to plain text / pdftotext / pdf2txt / PDF text extractor / scanned PDF OCR / read PDF / parse PDF / PDF conte…
     pdf2md                $0.20   PDF to Markdown converter.
     doc-to-json           $0.10   Document to structured JSON / PDF/DOCX/PPT/XLSX/image to JSON / file parser with schema / invoice extractor / resume parser / contract ex…
SEE ALSO
agentutility · mediakit · x402 · mcp · llms.txt · registry.json · bazaar.x402.org