Token-Router.org

The economy

Earn the inference you spend.

Run a node — a GPU you own, Apple Silicon, or a cloud account — and earn credits on every token it serves. Spend those same credits back through an OpenAI-compatible gateway on faster, pooled inference. Credits are shared between providers and the platform, and the platform funds bounties that drive more nodes toward the models that need it most.

How credits move

Credits flow where the work is.

When a consumer spends credits, the network shares them between the serving node and the platform. The platform then funds bounties that drive more nodes to the models that need it most.

ConsumerToken-RouterProviderPlatformPaid callEarnsFunds bountiesbounty
Starting credit$5

On every new account, so you can call the gateway before you buy or earn.

Same terms for every accountOne deal

Every paid call shares credits between the provider that served it and the platform. No per-node negotiation. No enterprise rate cards.

Where the platform's share goesBounties

Payout multipliers on under-supplied models, funded by the platform — not the consumer. Bounties direct compute to the models that need it most.

Routing

Least-loaded healthy candidate.

Fairness is availability-based, not a performance leaderboard. Short hiccups don't park you; bad consumer input doesn't count.

requestgatewayA55%B22%C68%D34%E50%

Least-loaded selection

The gateway finds every active instance serving the requested model and picks the one with the most spare capacity. Ties are broken randomly so equally-idle nodes share traffic over time.

Forgiving breaker

Three failures in 60 seconds parks you for 15 seconds; a single probe request lets you fully back in. 4xx user errors are neutral — your reputation is untouched when consumers send bad input.

A 5 tok/s floor

A minimum sustained generation throughput preserves consumer UX. Above the floor, a 200 tok/s H100 and a 6 tok/s consumer GPU are equally fair candidates — routing is by availability, not ranking.

Long-term operators

Stay around and you earn more.

None of these mechanisms punish newcomers. They reward not leaving.

  1. 01

    Rolling health window

    A 14-day success / failure / neutral score that tie-breaks routing among least-loaded candidates. Long-running nodes accumulate a visible signal.

  2. 02

    Ramp-up for new instances

    New nodes are admitted at a fraction of their declared rate and scale to 100% over the first week. A Sybil deterrent — not a tax on real operators.

  3. 03

    Uptime credit

    Probe outcomes and accumulated online time roll into the same window. Time spent answering counts.

  4. 04

    Anti-flapping on model switches

    Switching the model an instance serves triggers a temporary payout reduction that decays over 72 hours, so coverage stays stable and operators don't chase whichever bounty is largest week-to-week.

Trust unlocks higher bounty multipliers. Long-term honest participation is, by construction, more profitable than short-term spoofing.

Two paths to earn

Run a node, your way.

On hardware you own, or on cloud GPUs you already pay for. The network pays you on every token your node serves — and the credits you earn are the same credits you spend on your own inference.

Run a node on hardware you own

Turn your own machine into a node.

A discrete GPU, a CPU box, a TPU, or Apple Silicon — anything that can serve an open model is a node. Register it from the dashboard and the router sends paid traffic, crediting your account on every token it generates.

Run a node on a cloud account

Or use compute you already pay for.

Spin up a node on any cloud GPU service of your choice and register it with that provider's API token. We encrypt the token at rest, forward your calls, and pay you on every served token — your cloud bill, your margin.

$5 in starting credits lets you call the gateway from day one — before your first paid token.

Either side. Same account.

Start earning. Start calling.

Run a node — a GPU you own, Apple Silicon, or a cloud account you already pay for. One balance — the credits you earn are the credits you spend on faster, pooled inference.