$ man pdf-to-markdown
/pdf-to-markdown
PRICE / CALL
$0.0025
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
mediakit
CATEGORY
utilities
STATUS
● live
NAME
pdf-to-markdown — converts digital or scanned pdfs to clean markdown with ai-powered, layout-aware extraction on the datalab marker engine
SYNOPSIS
POST https://x402.agentutility.ai/pdf-to-markdown
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call →
402 Payment Required. Sign USDC transferWithAuthorization, retry with the X-PAYMENT header.
DESCRIPTION
Converts digital or scanned PDFs to clean Markdown with AI-powered, layout-aware extraction on the Datalab Marker engine. Preserves headings, tables, equations (LaTeX), bulleted lists, and multi-column flow; outputs Markdown (default), HTML, or structured JSON with per-page blocks. Input PDFs are limited to 30 pages. Typical uses: PDF parsing, PDF-to-text conversion, OCR of scanned PDFs, table extraction, equation-aware parsing, and general PDF data extraction.
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| pdf_url | string | Public URL of a PDF file (http or https). Must be directly fetchable, not behind auth or a viewer redirect. Max 30 pages. | required |
| output_format | string | 'markdown' (default — best for LLM downstream), 'html' (preserves more layout structure), or 'json' (per-page blocks with type + bbox). enum: markdown · html · json | optional |
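A minimal Python sketch of building a request body that matches the schema above. The helper name `build_request` and the example URL are hypothetical; only the `pdf_url` and `output_format` properties and the three enum values come from the table. The 30-page limit cannot be checked client-side without fetching the PDF, so it is not validated here.

```python
from urllib.parse import urlparse

ALLOWED_FORMATS = ("markdown", "html", "json")  # enum from the INPUT table


def build_request(pdf_url: str, output_format: str = "markdown") -> dict:
    """Return a JSON-serializable body for POST /pdf-to-markdown (hypothetical helper)."""
    parsed = urlparse(pdf_url)
    # The schema requires a public http or https URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"pdf_url must be an http(s) URL, got {pdf_url!r}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
    return {"pdf_url": pdf_url, "output_format": output_format}


# Placeholder URL for illustration only.
body = build_request("https://example.com/report.pdf", "html")
```

Rejecting bad input locally avoids paying for a call that the endpoint would presumably refuse.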
OUTPUT — response shape
| field | type | description |
|---|---|---|
| markdown | string | Markdown content (present when output_format is 'markdown') |
| html | string | HTML content (present when output_format is 'html') |
| json | object | Structured JSON content (present when output_format is 'json') |
| output_format | string | Echo of the format used |
| page_count | number | Number of pages in the source PDF |
| source_url | string | Echo of the input URL |
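Because the content field's name depends on `output_format`, a caller can use the echoed format to select it. The sketch below assumes the response has already been parsed into a dict; `extract_content` is a hypothetical helper and the sample values are invented for illustration.

```python
def extract_content(response: dict):
    """Return the converted content, keyed by the echoed output_format."""
    fmt = response["output_format"]
    if fmt not in ("markdown", "html", "json"):
        raise ValueError(f"unexpected output_format: {fmt!r}")
    if fmt not in response:
        raise KeyError(f"response is missing the {fmt!r} field")
    # 'markdown' and 'html' are strings; 'json' is an object (dict).
    return response[fmt]


# Invented sample shaped like the OUTPUT table; not a real response.
sample = {
    "markdown": "# Title\n\nBody text.",
    "output_format": "markdown",
    "page_count": 2,
    "source_url": "https://example.com/report.pdf",
}
content = extract_content(sample)
```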
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/pdf-to-markdown \
-H 'Content-Type: application/json' \
-d '{ }'
first response =
402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>
# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...
# Then call the pdf-to-markdown tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
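For callers not using the MCP package, the two-step handshake from EXAMPLE 1 (an unpaid call returns 402, then a retry carries X-PAYMENT) can be sketched in Python as follows. This is an outline under stated assumptions: the transport (`send`) and the signer (`sign`) are hypothetical callables supplied by the caller, the shape of the 402 body is not specified on this page, and success is assumed to be HTTP 200. Producing a real signed transferWithAuthorization is out of scope here.

```python
import json
from typing import Callable, Tuple

ENDPOINT = "https://x402.agentutility.ai/pdf-to-markdown"

# send(url, headers, body) -> (status_code, parsed_json_body); hypothetical transport
Send = Callable[[str, dict, bytes], Tuple[int, dict]]
# sign(payment_requirements) -> value for the X-PAYMENT header; hypothetical signer
Sign = Callable[[dict], str]


def call_with_payment(payload: dict, send: Send, sign: Sign) -> dict:
    """POST once; on 402, sign the returned requirements and retry once."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}

    status, data = send(ENDPOINT, headers, body)
    if status == 402:
        # First call: the server answers with payment requirements.
        headers = {**headers, "X-PAYMENT": sign(data)}
        status, data = send(ENDPOINT, headers, body)
    if status != 200:  # assumption: success is reported as HTTP 200
        raise RuntimeError(f"request failed with HTTP {status}")
    return data


# Offline stand-ins so the flow can be exercised without network or funds.
def fake_send(url: str, headers: dict, body: bytes) -> Tuple[int, dict]:
    if "X-PAYMENT" not in headers:
        return 402, {"note": "placeholder payment requirements"}
    return 200, {"markdown": "# Demo", "output_format": "markdown"}


result = call_with_payment(
    {"pdf_url": "https://example.com/a.pdf"},
    fake_send,
    lambda requirements: "signed-placeholder",
)
```

Injecting `send` and `sign` keeps the retry logic independent of any particular HTTP client or wallet library.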
METADATA
- tags
- pdf · markdown · html · json · conversion
- env
- DATALAB_API_KEY
- methods
- POST
- cluster
- mediakit
- price
- $0.0025 USDC per call
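A short arithmetic sketch for budgeting at the listed per-call price. It uses `Decimal` to avoid floating-point drift. The 6-decimal precision for USDC is an assumption based on the standard token contract and is not stated on this page; the helper names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.0025")  # USDC per call, from the price field
USDC_DECIMALS = 6  # assumption: standard USDC token precision


def total_cost(calls: int) -> Decimal:
    """Total USDC spent on the given number of paid calls."""
    return PRICE_PER_CALL * calls


def to_atomic_units(amount: Decimal) -> int:
    """Convert a USDC amount to integer token units under the 6-decimal assumption."""
    return int(amount * 10**USDC_DECIMALS)


budget = total_cost(400)                    # 400 calls cost exactly 1 USDC
per_call_units = to_atomic_units(PRICE_PER_CALL)  # 2500 under the 6-decimal assumption
```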
ADJACENT — other endpoints in mediakit
| endpoint | description | price |
|---|---|---|
| convert-pdf | Converts PDFs to Markdown, HTML, JSON, or structured text with the Datalab Marker AI pipeline, preserving headings, tables, equations, an… | $0.0025 |
| ocr | Runs OCR on scanned PDFs and image-based documents, returning clean Markdown or plain text. | $0.0025 |
| pdf-parser-api | Parses a public PDF URL into Markdown, HTML, or JSON blocks with layout-aware text, headings, tables, and equations. | $0.0025 |
| pdf-text-extractor | Extracts clean Markdown, HTML, or structured JSON from digital or scanned PDFs while preserving reading order, tables, and equations. | $0.0025 |
| pdf-to-markdown-api | Converts a public PDF URL into clean Markdown, HTML, or structured JSON while preserving headings, tables, equations, and reading order. | $0.0025 |
| pdf-to-text | Extracts text from digital or scanned PDFs, preserving reading order across multi-column layouts with an AI + OCR pipeline (Datalab Marker). | $0.0025 |
| pdf-to-text-api | Extracts text from digital and scanned PDFs as Markdown, plain text, HTML, or JSON with layout-aware reading order. | $0.0025 |
| compress-pdf | PDF compressor / PDF size reducer. | $0.005 |
SEE ALSO