$ man content-simhash
/content-simhash
PRICE / CALL
$0.005
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
wordmint
CATEGORY
uncategorized
STATUS
● live
NAME
content-simhash — fingerprints text with a 64-bit simhash for near-duplicate detection, computed entirely locally
SYNOPSIS
POST https://x402.agentutility.ai/content-simhash
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call →
402 Payment Required. Sign USDC transferWithAuthorization, retry with the X-PAYMENT header.
DESCRIPTION
Fingerprints text with a 64-bit SimHash for near-duplicate detection, computed entirely locally. Uses token-level k-shingles (default k=3) hashed with FNV-1a; texts that share many shingles tend to produce SimHashes that are 'close' (small Hamming distance), while unrelated texts usually land far apart. Returns hex and decimal forms plus token and shingle counts. Useful for content dedup pipelines, plagiarism detection, and bot-content clustering. Use it as a content fingerprint, dedup hash, or locality-sensitive hash.
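The scheme described above can be sketched locally in Python. This is a minimal illustration, not the service's implementation: the tokenizer (lowercased `\w+` runs), the space-joined shingle encoding, the unweighted voting, and the handling of texts shorter than k tokens are all assumptions, so fingerprints from this sketch will not necessarily match the endpoint's output. The names `fnv1a_64` and `simhash64` are hypothetical.

```python
import re

FNV_OFFSET_BASIS = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV_PRIME = 0x100000001B3              # standard FNV-1a 64-bit prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


def simhash64(text: str, shingle_size: int = 3) -> int:
    """Unweighted 64-bit SimHash over distinct token-level k-shingles."""
    # Assumption: tokens are lowercased word runs; the service may differ.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0
    if len(tokens) < shingle_size:
        # Assumption: short inputs are treated as a single shingle.
        shingles = {" ".join(tokens)}
    else:
        shingles = {
            " ".join(tokens[i:i + shingle_size])
            for i in range(len(tokens) - shingle_size + 1)
        }
    # Each shingle hash votes +1 or -1 on every bit position.
    votes = [0] * 64
    for shingle in shingles:
        h = fnv1a_64(shingle.encode("utf-8"))
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    # A bit is set in the fingerprint when its votes are net positive.
    fingerprint = 0
    for bit in range(64):
        if votes[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


if __name__ == "__main__":
    fp = simhash64("the quick brown fox jumps over the lazy dog")
    print(format(fp, "016x"))
```

Because every shingle votes on all 64 bits, changing a few shingles flips only the bits whose vote margin was small, which is why similar texts tend to end up a few bits apart.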
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| text | string | Text to hash. Up to 500,000 chars. | required |
| shingle_size | number | k-gram size for shingles. Range [1, 8]. Default 3. | optional |
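As a rough illustration of the request schema above, the following Python sketch builds a JSON body and checks the documented limits (text up to 500,000 characters, shingle_size in [1, 8]) on the client before paying for a call. The helper name `build_request_body` is hypothetical, and how the service itself reports out-of-range input is not stated in this page.

```python
import json
from typing import Optional

MAX_TEXT_CHARS = 500_000               # documented upper bound for `text`
MIN_SHINGLE, MAX_SHINGLE = 1, 8        # documented range for `shingle_size`


def build_request_body(text: str, shingle_size: Optional[int] = None) -> str:
    """Return a JSON request body, validating the documented constraints."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    body = {"text": text}
    if shingle_size is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(shingle_size, bool) or not isinstance(shingle_size, int):
            raise TypeError("shingle_size must be an integer")
        if not MIN_SHINGLE <= shingle_size <= MAX_SHINGLE:
            raise ValueError("shingle_size must be in [1, 8]")
        body["shingle_size"] = shingle_size
    return json.dumps(body)


if __name__ == "__main__":
    print(build_request_body("the quick brown fox", shingle_size=2))
```

Omitting shingle_size leaves the field out of the body so the documented default of 3 applies.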
OUTPUT — response shape
| field | type | description |
|---|---|---|
| hash_hex | string | 64-bit SimHash fingerprint as a 16-character lowercase hex string. |
| hash_int | string | Same 64-bit SimHash rendered as a decimal integer string (safe for languages without u64). |
| bit_count | string | Number of set bits (popcount) in the SimHash, useful as a quick sanity check. |
| token_count | string | Number of tokens extracted from the input text before shingling. |
| shingle_count | string | Number of distinct k-shingles hashed into the SimHash. |
| shingle_size | string | Shingle width k used (tokens per shingle), default 3. |
| text_chars | string | Character length of the input text that was hashed. |
| source | string | Echoes how the text was supplied (e.g. inline text vs fetched URL). |
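To compare two responses, a client would typically XOR the two fingerprints and count the differing bits. The sketch below works on the hash_hex field described above; the function names are hypothetical, and the default threshold of 3 bits is only an assumption to be tuned per corpus, not a value documented for this endpoint.

```python
def hamming_distance_hex(a_hex: str, b_hex: str) -> int:
    """Number of differing bits between two 16-character hex fingerprints."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


def is_near_duplicate(a_hex: str, b_hex: str, max_distance: int = 3) -> bool:
    """Treat two texts as near-duplicates when few fingerprint bits differ.

    Assumption: max_distance=3 is an illustrative starting point only.
    """
    return hamming_distance_hex(a_hex, b_hex) <= max_distance


def popcount_hex(hash_hex: str) -> int:
    """Set-bit count of a fingerprint, comparable to the bit_count field."""
    return bin(int(hash_hex, 16)).count("1")


if __name__ == "__main__":
    # Made-up fingerprints for illustration, not real endpoint output.
    a, b = "00000000000000ff", "000000000000000f"
    print(hamming_distance_hex(a, b), is_near_duplicate(a, b))
```

The decimal hash_int form can be compared the same way after `int(hash_int)`, since Python integers are arbitrary precision.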
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.agentutility.ai/content-simhash \
-H 'Content-Type: application/json' \
-d '{ }'
first response =
402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>
# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...
# Then call the content-simhash tool from your MCP-aware agent.
MCP server handles payment automatically — your coding agent just calls the tool by name.
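For clients that do not use the MCP server, the 402 handshake from Example 1 can be sketched as a two-step retry. This is only an outline under stated assumptions: `send` stands in for whatever HTTP client is used, and `sign_payment` is a hypothetical callable that turns the payment requirements into an X-PAYMENT header value. Signing itself (USDC transferWithAuthorization) is not shown here.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]                      # (HTTP status, response body)
Send = Callable[[Dict[str, str], str], Response]


def call_with_payment(
    send: Send,
    sign_payment: Callable[[str], str],
    body: str,
) -> Response:
    """POST once; on 402, sign the payment requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = send(headers, body)
    if status != 402:
        return status, payload
    # Assumption: the 402 body carries the payment requirements to sign.
    paid_headers = dict(headers)
    paid_headers["X-PAYMENT"] = sign_payment(payload)
    return send(paid_headers, body)


if __name__ == "__main__":
    # Stand-in transport and signer so the flow can be exercised offline.
    def fake_send(headers: Dict[str, str], body: str) -> Response:
        if "X-PAYMENT" not in headers:
            return 402, '{"requirements": "placeholder"}'
        return 200, '{"ok": true}'

    print(call_with_payment(fake_send, lambda req: "signed-placeholder", "{}"))
```

Passing the transport and signer in as callables keeps the retry logic testable without network access or a funded key.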
METADATA
- tags
- wordmint, content-hashing, simhash, near-duplicate-detection, dedup, fingerprinting, locality-sensitive-hash, shingling
- methods
- POST
- cluster
- wordmint
- price
- $0.005 USDC per call
ADJACENT — other endpoints in wordmint
| endpoint | description | price |
|---|---|---|
| brand-tagline | Generates brand taglines and slogans for launch pages, X bios, email copy, and product cards. | $0.005 |
| brand-tagline-generate | Generates tagline options for a brand or startup from its name, concept, audience, and tone. | $0.005 |
| card-resolve | Normalizes free-form graded card text into a canonical card object. | $0.005 |
| cron-parse | Cron parser. | $0.005 |
| detect-language | Language detector / language identification. | $0.005 |
| dictionary-define | Looks up English word definitions with pronunciation, part of speech, and synonyms. | $0.005 |
| embedding-similarity | Measures how semantically similar two strings are: embeds both via Venice (default model: text-embedding-bge-m3) and returns the cosine s… | $0.005 |
| extract-entities | Named entity recognition (NER) / entity extractor. | $0.005 |
SEE ALSO