Name: text-to-speech
Price: 0.05 USDC
Availability: InStock

$ man text-to-speech

PRICE / CALL

$0.05

USDC · base mainnet · scheme: exact

METHOD

POST

CLUSTER

synthforge

CATEGORY

STATUS

● live

NAME

text-to-speech — converts text to speech with 30+ voices and 5 audio formats

SYNOPSIS

POST https://x402.agentutility.ai/text-to-speech
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }

↳ The first call returns 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
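The two-step flow above can be sketched as follows. This is a minimal illustration, not the service's own client: `send` and `sign` are hypothetical callables standing in for an HTTP client and a wallet signer, and the shape of the 402 payload passed to `sign` is an assumption.

```python
from typing import Callable, Dict, Tuple

# Hypothetical transport: takes (headers, body), returns (status_code, parsed_json).
Send = Callable[[Dict[str, str], dict], Tuple[int, dict]]
# Hypothetical signer: takes the 402 payload, returns the X-PAYMENT header value.
Sign = Callable[[dict], str]


def call_with_payment(send: Send, sign: Sign, body: dict) -> dict:
    """POST once without payment; on 402, sign the requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = send(headers, body)
    if status == 200:
        return payload
    if status != 402:
        raise RuntimeError(f"unexpected status {status}")

    # Attach the signed authorization and retry the same request body.
    paid_headers = dict(headers)
    paid_headers["X-PAYMENT"] = sign(payload)
    status, payload = send(paid_headers, body)
    if status != 200:
        raise RuntimeError(f"paid retry failed with status {status}")
    return payload
```

Passing the transport and signer in as callables keeps the retry logic independent of any particular HTTP library or wallet.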

DESCRIPTION

Converts text to speech with 30+ voices and 5 audio formats. Morpheus is the primary provider for the Kokoro model; Venice serves as the fallback and supplies the alternate TTS models (xAI / ElevenLabs / Orpheus / MiniMax / Gemini). Generated audio is stored on fal.ai, which provides the hosted audio URLs. Use it as a TTS API or voice generator.

INPUT — request schema

property	type	description	req?
text	string	—	required
voice	string	—	optional
model	string	—	optional
speed	number	—	optional
format	string	— enum: mp3 · wav · opus · aac · flac	optional
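A request body matching the schema above could be assembled like this. It is a sketch under assumptions: `build_request` is a hypothetical helper, and rejecting an empty `text` is a guess, since the schema only marks the field as required. The table does not document valid `voice`, `model`, or `speed` values, so they are passed through unchecked.

```python
from typing import Optional

# Format values listed in the request schema.
ALLOWED_FORMATS = ("mp3", "wav", "opus", "aac", "flac")


def build_request(text: str,
                  voice: Optional[str] = None,
                  model: Optional[str] = None,
                  speed: Optional[float] = None,
                  fmt: Optional[str] = None) -> dict:
    """Return a JSON-serializable body containing only the fields that were set."""
    if not isinstance(text, str) or not text:
        raise ValueError("text is required and must be a non-empty string")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")

    body = {"text": text}
    optional = {"voice": voice, "model": model, "speed": speed, "format": fmt}
    # Omit unset optional fields so the server can apply its own defaults.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

For example, `build_request("Hello", fmt="wav")` yields a body with only `text` and `format`.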

OUTPUT — response shape

field	type	description
audio_url	string	Hosted URL pointing to the generated speech audio file.
file_size_bytes	number	Size of the generated audio file in bytes.
content_type	string	MIME type of the audio file, typically audio/mpeg for MP3 output.
format	string	Audio container format returned: mp3, opus, aac, flac, wav, or pcm (pcm is not among the five values the request schema accepts).
voice	string	Voice identifier used for synthesis, drawn from the 30+ available voices.
model	string	TTS model that produced the audio (Kokoro, xAI, ElevenLabs, Orpheus, MiniMax, or Gemini).
speed	number	Playback speed multiplier applied during synthesis, where 1.0 is normal pace.
input_chars	number	Character count of the input text that was synthesized into speech.
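One way to consume the response is to map it onto a typed record and fail early if a documented field is absent. This is a sketch: `SpeechResult` and `parse_result` are hypothetical names, and it assumes the service returns every field in the table on success.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpeechResult:
    audio_url: str
    file_size_bytes: int
    content_type: str
    format: str
    voice: str
    model: str
    speed: float
    input_chars: int


def parse_result(payload: dict) -> SpeechResult:
    """Build a SpeechResult from decoded JSON, ignoring any extra keys."""
    names = [field.name for field in fields(SpeechResult)]
    missing = [name for name in names if name not in payload]
    if missing:
        raise KeyError(f"response is missing fields: {missing}")
    return SpeechResult(**{name: payload[name] for name in names})
```

Ignoring unknown keys keeps the parser tolerant of fields added to the response later.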

EXAMPLES — two ways to call

EXAMPLE 1 · curl

curl -X POST https://x402.agentutility.ai/text-to-speech \
  -H 'Content-Type: application/json' \
  -d '{ "text": "Hello, world." }'

The first response is 402 Payment Required and carries the payment requirements; sign them and retry with the X-PAYMENT header.

EXAMPLE 2 · mcp

# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the text-to-speech tool from your MCP-aware agent.

The MCP server handles payment automatically; your coding agent just calls the tool by name.

METADATA

tags: tts · speech · audio · voice · ai
env: VENICE_API_KEY · FAL_KEY
methods: POST
cluster: synthforge
price: $0.05 USDC per call
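For budgeting, the per-call price can be turned into a batch cost and into USDC's smallest unit. This is a sketch: the helper names are hypothetical, and it assumes the payment amount is expressed in smallest units (USDC uses 6 decimal places).

```python
from decimal import Decimal

USDC_DECIMALS = 6                 # USDC has 6 decimal places
PRICE_PER_CALL = Decimal("0.05")  # price listed for this endpoint


def to_atomic_units(amount: Decimal) -> int:
    """Convert a USDC amount to its smallest unit (1 USDC = 1_000_000 units)."""
    return int(amount * (10 ** USDC_DECIMALS))


def cost_for_calls(calls: int) -> Decimal:
    """Total USDC cost for a number of calls at the listed price."""
    return PRICE_PER_CALL * calls
```

Using `Decimal` rather than `float` avoids rounding drift when many small payments are summed.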

ADJACENT — other endpoints in synthforge

endpoint	description	price
music-generate	Generates music from a text prompt via Venice using the minimax-music-v26 model.	$0.05
voice	Converts text to speech with 30+ voices and MP3/WAV/OPUS/AAC/FLAC output.	$0.05
image-generate-pro	Premium text-to-image generation across margin-safe Venice models at a competitive $0.06/call.	$0.06
recraft	Generates SFW design and illustration images with Venice's recraft-v4 model on a dedicated endpoint.	$0.06
seedream	Generates SFW images with Venice's seedream-v4 model on a dedicated endpoint.	$0.06
flux-2-pro	Generates SFW images with Venice's flux-2-pro model on a dedicated endpoint.	$0.04
qwen-image	Generates SFW images with Venice's qwen-image model on a dedicated endpoint.	$0.04
background-remove	Removes the background from a public image URL and returns the subject with alpha transparency.	$0.08