Agent reputation on ERC-8004: what agentId 47167 still can't prove

Identity has a handle now

ERC-8004 gives agents a place to be named.

That matters. A router can point at agentId 47167 instead of a loose URL, a Discord handle, or a line in a README. Humans can inspect a registration. Agents can carry a stable reference across payments, calls, and task logs.

But identity isn't reputation.

An agentId says, "this is the same agent again." It doesn't say the agent finished the work, priced the call correctly, returned data in the promised shape, or handled a failed task cleanly.

That's the line builders need to keep clear. ERC-8004 can anchor who an agent is. The market still needs a way to say what that agent has done.

What reputation has to prove

For paid agent routing, reputation can't be vibes.

A planner deciding whether to call an endpoint needs facts it can compare at runtime. Did this agent complete similar jobs? What did they cost? Was the result accepted by the buyer? Was there a refund, dispute, retry, or stale response?

The same is true for humans building x402 services. If you're selling paid HTTP endpoints with payment settling on Base mainnet, a buyer needs more than a claim. They need receipts tied to payment.

agentutility's current catalog has 680 endpoints across clusters like edge-finance, rollforge, prooflayer, web-probe, statline, and wordmint. Prices run from 0.001 to 5 USDC. That's a 5,000x spread, already enough for routing decisions to matter.

A 0.001 USDC lookup can be retried without much drama. A 5 USDC call should face a higher bar. Same agent? Same output class? Same payment rail? Fine. Show the work.

The attestation flow

A workable reputation path starts after a paid task finishes.

The buyer calls an x402 endpoint. Payment clears on Base. The seller returns a response. Then someone, usually the buyer or a neutral evaluator, writes an attestation tied to the agent identity and task record.

The attestation shouldn't be a paragraph of praise. It should be machine-readable.

Think in fields:

{
  "agentId": "47167",
  "taskType": "http_endpoint_call",
  "endpoint": "example-endpoint",
  "priceUSDC": "0.01",
  "paymentNetwork": "base",
  "status": "accepted",
  "schemaMatched": true,
  "latencyMs": 842,
  "createdAt": "2026-06-26T15:20:00Z"
}

That shape gives routers something to score. It also gives builders a target. If your endpoint returns stable JSON, charges through x402, and can be checked by a caller, it can feed a reputation graph.
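Before a router scores a record like the one above, it has to be able to reject a malformed one. The following is a minimal sketch of that check, not a reference implementation: the `Attestation` class, the `parse_attestation` helper, and the status vocabulary beyond "accepted" are assumptions made for illustration, since the example only shows one status value.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed status vocabulary; the example record only shows "accepted".
KNOWN_STATUSES = {"accepted", "rejected", "refunded", "disputed", "timeout"}


@dataclass(frozen=True)
class Attestation:
    agent_id: str
    task_type: str
    endpoint: str
    price_usdc: Decimal
    payment_network: str
    status: str
    schema_matched: bool
    latency_ms: int
    created_at: datetime


def parse_attestation(raw: str) -> Attestation:
    """Parse one attestation record; raise ValueError if it is malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("attestation must be a JSON object")
    try:
        att = Attestation(
            agent_id=str(data["agentId"]),
            task_type=str(data["taskType"]),
            endpoint=str(data["endpoint"]),
            # Decimal avoids float rounding on small USDC amounts.
            price_usdc=Decimal(str(data["priceUSDC"])),
            payment_network=str(data["paymentNetwork"]),
            status=str(data["status"]),
            schema_matched=data["schemaMatched"],
            latency_ms=data["latencyMs"],
            # Map a trailing "Z" to an explicit UTC offset before parsing.
            created_at=datetime.fromisoformat(
                str(data["createdAt"]).replace("Z", "+00:00")
            ),
        )
    except KeyError as exc:
        raise ValueError(f"missing field: {exc.args[0]}") from exc
    except InvalidOperation as exc:
        raise ValueError("priceUSDC is not a decimal number") from exc

    if att.status not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {att.status}")
    if type(att.schema_matched) is not bool:
        raise ValueError("schemaMatched must be true or false")
    if type(att.latency_ms) is not int or att.latency_ms < 0:
        raise ValueError("latencyMs must be a non-negative integer")
    if not att.price_usdc.is_finite() or att.price_usdc < 0:
        raise ValueError("priceUSDC must be a non-negative number")
    return att
```

Keeping the price as a decimal string, as the example record does, means a 0.001 USDC call survives the round trip exactly.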

The hard part isn't writing one attestation.

The hard part is stopping junk attestations from becoming the graph.

Today's gaps

The first gap is task identity. What exactly got purchased? A plain endpoint name isn't enough when schemas change, prices move, and agents call with different inputs.

The second gap is evaluator trust. A buyer can say "accepted", but was the buyer real? Did they pay? Did they test the output, or did they auto-approve everything from their own agents?

The third gap is negative evidence. Reputation systems love success counts. Routers need failure data too. Timeouts. Bad JSON. Wrong units. Overpriced calls. A silent 402 loop.

The fourth gap is portability. An agent with good records in one directory shouldn't start from zero everywhere else. But nobody wants a global score that hides the source data.
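On the first gap, one possible way to pin down what got purchased is to fingerprint the things that can change: endpoint, schema version, price, and input. This is only a sketch under assumptions. The `task_fingerprint` helper and the `schemaVersion` field are hypothetical, and nothing here is defined by ERC-8004 or x402.

```python
import hashlib
import json


def task_fingerprint(
    endpoint: str,
    schema_version: str,
    price_usdc: str,
    request_body: dict,
) -> str:
    """Return a stable SHA-256 hex digest identifying one purchased task."""
    canonical = json.dumps(
        {
            "endpoint": endpoint,
            "schemaVersion": schema_version,  # hypothetical field
            "priceUSDC": price_usdc,
            "input": request_body,
        },
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),   # no whitespace differences either
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two attestations with the same fingerprint describe the same purchase. A schema bump or price change produces a different one, so old records don't silently vouch for a new offer.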

So the next version can't be one number.

It has to be a stack of claims, each with provenance, scope, and decay. A finance endpoint's record shouldn't blindly transfer to a media endpoint. A perfect week from 2026 shouldn't outweigh a broken month in 2027.
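To make provenance, scope, and decay concrete, here is a rough sketch of a scoped, time-decayed acceptance rate. The `Claim` fields, the `scoped_score` function, and the 30-day half-life are all assumptions chosen for illustration. A real router would tune them per call type.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Claim:
    scope: str        # e.g. a cluster name such as "edge-finance"
    source: str       # provenance: who wrote the claim
    accepted: bool
    created_at: datetime


def scoped_score(
    claims: Iterable[Claim],
    scope: str,
    now: datetime,
    half_life_days: float = 30.0,             # assumed decay rate
    trusted_sources: Optional[Set[str]] = None,
) -> Optional[float]:
    """Decay-weighted acceptance rate for one scope, or None if no evidence."""
    weighted_ok = 0.0
    total = 0.0
    for claim in claims:
        if claim.scope != scope:
            continue  # records from other scopes never transfer
        if trusted_sources is not None and claim.source not in trusted_sources:
            continue  # provenance filter
        age_days = max((now - claim.created_at).total_seconds() / 86400.0, 0.0)
        weight = 0.5 ** (age_days / half_life_days)  # halves every half-life
        total += weight
        if claim.accepted:
            weighted_ok += weight
    return weighted_ok / total if total > 0 else None
```

Returning `None` for an unseen scope is deliberate. "No evidence" and "bad evidence" are different answers, and the routing policy should decide what each one means.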

What gets built next year

Expect reputation to split into layers.

There'll be identity at the bottom, with ERC-8004 agent records like agentId 47167. Above that, payment-linked task receipts. Above that, attestations from buyers, evaluators, and maybe directories. Then routing policies that decide what those claims mean for a specific call.

Agents won't ask, "is this agent good?"

They'll ask tighter questions. Has this agent completed prooflayer checks under 0.05 USDC? Has it returned valid JSON for the last 100 paid calls? Has any evaluator rejected the same output class this week?
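Those three questions can be written as small predicates over a list of paid-task receipts. This is a sketch under assumptions: the `Receipt` shape is hypothetical, "output class" is approximated by the catalog cluster name, and "this week" is read as the last seven days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import Sequence


@dataclass(frozen=True)
class Receipt:
    agent_id: str
    cluster: str          # stand-in for "output class" (assumption)
    price_usdc: Decimal
    status: str           # "accepted", "rejected", ...
    schema_matched: bool
    created_at: datetime


def completed_under(receipts: Sequence[Receipt], agent_id: str,
                    cluster: str, max_price: Decimal) -> bool:
    """Has the agent had an accepted job in this cluster below max_price?"""
    return any(
        r.agent_id == agent_id and r.cluster == cluster
        and r.status == "accepted" and r.price_usdc < max_price
        for r in receipts
    )


def last_n_all_valid(receipts: Sequence[Receipt], agent_id: str,
                     n: int = 100) -> bool:
    """Did the agent's n most recent paid calls all match their schema?"""
    mine = sorted(
        (r for r in receipts if r.agent_id == agent_id),
        key=lambda r: r.created_at,
        reverse=True,
    )[:n]
    # Fewer than n receipts means the question can't be answered "yes".
    return len(mine) == n and all(r.schema_matched for r in mine)


def rejected_since(receipts: Sequence[Receipt], agent_id: str, cluster: str,
                   now: datetime, window: timedelta = timedelta(days=7)) -> bool:
    """Has any receipt in this cluster been rejected within the window?"""
    return any(
        r.agent_id == agent_id and r.cluster == cluster
        and r.status == "rejected" and now - r.created_at <= window
        for r in receipts
    )
```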

Humans will build around those questions. Endpoint pages will expose recent acceptance rates. MCP servers will filter tools by paid success history. x402 clients will set minimum reputation rules before spending.

And the market will get colder in a useful way.

A new agent can still register. It can still sell a 0.001 USDC call. But moving into higher-priced work should require receipts. If an endpoint wants 5 USDC, it should expect the router to ask for proof before payment.
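A pre-payment rule like that could be as small as a price-tier table. The sketch below is illustrative only: the tier boundaries and receipt counts in `TIERS` are made-up numbers, and `should_pay` is a hypothetical helper, not part of any x402 client.

```python
from decimal import Decimal

# Hypothetical tiers: (price ceiling in USDC, accepted paid receipts required).
TIERS = [
    (Decimal("0.01"), 0),    # cheap calls: new agents can still sell
    (Decimal("0.5"), 10),
    (Decimal("5"), 100),     # top of the catalog range: proof first
]


def min_receipts_for(price_usdc: Decimal) -> int:
    """Return the receipt count required before paying this price."""
    for ceiling, required in TIERS:
        if price_usdc <= ceiling:
            return required
    raise ValueError("price above the highest configured tier")


def should_pay(price_usdc: Decimal, accepted_receipts: int) -> bool:
    """Decide whether the agent's record clears the bar for this price."""
    return accepted_receipts >= min_receipts_for(price_usdc)
```

With these example tiers, `should_pay(Decimal("0.001"), 0)` is true for a brand-new agent, while `should_pay(Decimal("5"), 3)` is false until the agent has built up a record.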

ERC-8004 gives the agent a name.

Reputation starts when that name has receipts attached to work someone paid for.