$ man content-simhash

/content-simhash

PRICE / CALL
$0.005
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
wordmint
CATEGORY
uncategorized
STATUS
live
NAME
content-simhash - fingerprints text with a 64-bit SimHash for near-duplicate detection, computed entirely locally
SYNOPSIS
POST https://x402.agentutility.ai/content-simhash
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
DESCRIPTION

Fingerprints text with a 64-bit SimHash for near-duplicate detection, computed entirely locally. The text is tokenized, grouped into token-level k-shingles (default k=3), and each shingle is hashed with FNV-1a. Two SimHashes are 'close' (small Hamming distance) when the underlying texts share most of their shingles; the relationship is probabilistic, so treat a small distance as strong evidence of near-duplication rather than proof. Returns the fingerprint in hex and decimal forms, plus token and shingle counts. Useful as a content fingerprint, dedup hash, or locality-sensitive hash in content dedup pipelines, plagiarism detection, and bot-content clustering.
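The description above can be sketched locally as follows. This is a minimal illustration of SimHash over token k-shingles hashed with 64-bit FNV-1a, not the endpoint's implementation: the tokenizer (lowercased `\w+` runs), the shingle join character, the equal weighting of distinct shingles, and the handling of texts shorter than k tokens are all assumptions, so the values it produces should not be expected to match `hash_hex` from the service.

```python
import re

FNV_OFFSET = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
FNV_PRIME = 0x100000001B3        # FNV-1a 64-bit prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


def simhash64(text: str, k: int = 3) -> int:
    """SimHash over distinct token k-shingles (tokenizer is an assumption)."""
    tokens = re.findall(r"\w+", text.lower())
    if len(tokens) < k:
        # Assumption: a text shorter than k tokens yields one short shingle.
        shingles = {" ".join(tokens)} if tokens else set()
    else:
        shingles = {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

    # Each shingle votes +1 or -1 on every bit position of the fingerprint.
    votes = [0] * 64
    for shingle in shingles:
        h = fnv1a_64(shingle.encode("utf-8"))
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1

    # A fingerprint bit is set when more shingles voted 1 than 0.
    return sum(1 << bit for bit in range(64) if votes[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Comparing `hamming(simhash64(a), simhash64(b))` for an edited copy of a text against an unrelated text shows the intended behaviour: the edited copy usually lands much closer, although any single comparison can deviate because the scheme is probabilistic.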

INPUT · request schema
property       type    req?      description
text           string  required  Text to hash. Up to 500,000 chars.
shingle_size   number  optional  k-gram size for shingles. Range [1, 8]. Default 3.
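A client can check these limits before paying for a call. The helper below is a hypothetical sketch (the name `build_request` is not part of the service); it assumes `shingle_size` is meant to be an integer and that leaving it out lets the server apply its default of 3.

```python
import json
from typing import Optional

MAX_TEXT_CHARS = 500_000          # limit stated in the request schema
MIN_SHINGLE, MAX_SHINGLE = 1, 8   # range stated in the request schema


def build_request(text: str, shingle_size: Optional[int] = None) -> str:
    """Validate the inputs against the schema and return a JSON request body."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")

    body = {"text": text}
    if shingle_size is not None:
        # Assumption: the 'number' type is intended to hold an integer k.
        if not isinstance(shingle_size, int) or not MIN_SHINGLE <= shingle_size <= MAX_SHINGLE:
            raise ValueError("shingle_size must be an integer in [1, 8]")
        body["shingle_size"] = shingle_size
    return json.dumps(body)
```

For example, `build_request("some article text", shingle_size=4)` returns a body that can be passed to `curl -d` or any HTTP client.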
OUTPUT · response shape
field          type    description
hash_hex       string  64-bit SimHash fingerprint as a 16-character lowercase hex string.
hash_int       string  Same 64-bit SimHash rendered as a decimal integer string (safe for languages without u64).
bit_count      string  Number of set bits (popcount) in the SimHash, useful as a quick sanity check.
token_count    string  Number of tokens extracted from the input text before shingling.
shingle_count  string  Number of distinct k-shingles hashed into the SimHash.
shingle_size   string  Shingle width k used (tokens per shingle), default 3.
text_chars     string  Character length of the input text that was hashed.
source         string  Echoes how the text was supplied (e.g. inline text vs fetched URL).
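The fields above are enough to cross-check a response and compare two fingerprints. The sketch below assumes the response has already been parsed into a dict with the field names from the table, and that numeric fields arrive as strings as the table states. The helper names and the `max_distance=3` threshold are illustrative choices, not values defined by the service; tune the threshold on your own data.

```python
def parse_fingerprint(response: dict) -> int:
    """Return the fingerprint as an int after cross-checking the response fields."""
    hash_hex = response["hash_hex"]
    if len(hash_hex) != 16:
        raise ValueError("hash_hex should be 16 hex characters")
    value = int(hash_hex, 16)
    if value != int(response["hash_int"]):
        raise ValueError("hash_hex and hash_int disagree")
    # bit_count is the popcount, so it doubles as a cheap integrity check.
    if bin(value).count("1") != int(response["bit_count"]):
        raise ValueError("bit_count does not match the fingerprint")
    return value


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two 64-bit fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(resp_a: dict, resp_b: dict, max_distance: int = 3) -> bool:
    """Treat two responses as near-duplicates when few fingerprint bits differ."""
    distance = hamming_distance(parse_fingerprint(resp_a), parse_fingerprint(resp_b))
    return distance <= max_distance
```

Only compare fingerprints produced with the same `shingle_size`; fingerprints built from different shingle widths are not comparable.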
EXAMPLES · two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/content-simhash \
  -H 'Content-Type: application/json' \
  -d '{"text": "The quick brown fox jumps over the lazy dog.", "shingle_size": 3}'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
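The 402-then-retry flow can be sketched independently of any HTTP or wallet library. In the sketch below, `post` and `sign_payment` are hypothetical callables you supply: `post(body, headers)` sends the request and returns `(status, parsed_json)`, and `sign_payment(requirements)` turns the payment requirements from the 402 response into the value for the `X-PAYMENT` header. How the requirements are structured and how the signature is produced are left to your x402 client and are not shown here.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]  # (HTTP status code, parsed JSON body)


def call_with_payment(
    post: Callable[[str, Dict[str, str]], Response],
    sign_payment: Callable[[dict], str],
    body: str,
) -> Response:
    """Send the request; on 402, sign the payment requirements and retry once."""
    status, payload = post(body, {})
    if status != 402:
        # Already paid for, or failed for a reason unrelated to payment.
        return status, payload

    # The 402 body carries the payment requirements to be signed.
    payment_header = sign_payment(payload)
    return post(body, {"X-PAYMENT": payment_header})
```

The function retries only once, so a second 402 (for example, a rejected signature) is returned to the caller rather than triggering another payment attempt.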
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the content-simhash tool from your MCP-aware agent.
The MCP server handles payment automatically; your coding agent just calls the tool by name.
METADATA
tags
wordmint · content-hashing · simhash · near-duplicate-detection · dedup · fingerprinting · locality-sensitive-hash · shingling
methods
POST
cluster
wordmint
price
$0.005 USDC per call
ADJACENT · other endpoints in wordmint
endpoint                price   description
brand-tagline           $0.005  Generates brand taglines and slogans for launch pages, X bios, email copy, and product cards.
brand-tagline-generate  $0.005  Generates tagline options for a brand or startup from its name, concept, audience, and tone.
card-resolve            $0.005  Normalizes free-form graded card text into a canonical card object.
cron-parse              $0.005  Cron parser.
detect-language         $0.005  Language detector / language identification.
dictionary-define       $0.005  Looks up English word definitions with pronunciation, part of speech, and synonyms.
embedding-similarity    $0.005  Measures how semantically similar two strings are: embeds both via Venice (default model: text-embedding-bge-m3) and returns the cosine s…
extract-entities        $0.005  Named entity recognition (NER) / entity extractor.
SEE ALSO
agentutility · wordmint · x402 · mcp · llms.txt · registry.json · bazaar.x402.org