The economy
Earn the inference you spend.
Run a node — a GPU you own, Apple Silicon, or a cloud account — and earn credits on every token it serves. Spend those same credits back through an OpenAI-compatible gateway on faster, pooled inference. Credits are shared between providers and the platform, and the platform funds bounties that drive more nodes toward the models that need it most.
How credits move
Credits flow where the work is.
When a consumer spends credits, the network shares them between the serving node and the platform. The platform then funds bounties that drive more nodes to the models that need it most.
On every new account, so you can call the gateway before you buy or earn.
Every paid call shares credits between the provider that served it and the platform. No per-node negotiation. No enterprise rate cards.
Payout multipliers on under-supplied models, funded by the platform — not the consumer. Bounties direct compute to the models that need it most.
Routing
Least-loaded healthy candidate.
Fairness is availability-based, not a performance leaderboard. Short hiccups don't park you; bad consumer input doesn't count.
Least-loaded selection
The gateway finds every active instance serving the requested model and picks the one with the most spare capacity. Ties are broken randomly so equally-idle nodes share traffic over time.
Forgiving breaker
Three failures in 60 seconds parks you for 15 seconds; a single probe request lets you fully back in. 4xx user errors are neutral — your reputation is untouched when consumers send bad input.
A 5 tok/s floor
A minimum sustained generation throughput preserves consumer UX. Above the floor, a 200 tok/s H100 and a 6 tok/s consumer GPU are equally fair candidates — routing is by availability, not ranking.
Long-term operators
Stay around and you earn more.
None of these mechanisms punish newcomers. They reward not leaving.
- 01
Rolling health window
A 14-day success / failure / neutral score that tie-breaks routing among least-loaded candidates. Long-running nodes accumulate a visible signal.
- 02
Ramp-up for new instances
New nodes are admitted at a fraction of their declared rate and scale to 100% over the first week. A Sybil deterrent — not a tax on real operators.
- 03
Uptime credit
Probe outcomes and accumulated online time roll into the same window. Time spent answering counts.
- 04
Anti-flapping on model switches
Switching the model an instance serves triggers a temporary payout reduction that decays over 72 hours, so coverage stays stable and operators don't chase whichever bounty is largest week-to-week.
Trust unlocks higher bounty multipliers. Long-term honest participation is, by construction, more profitable than short-term spoofing.
Two paths to earn
Run a node, your way.
On hardware you own, or on cloud GPUs you already pay for. The network pays you on every token your node serves — and the credits you earn are the same credits you spend on your own inference.
Turn your own machine into a node.
A discrete GPU, a CPU box, a TPU, or Apple Silicon — anything that can serve an open model is a node. Register it from the dashboard and the router sends paid traffic, crediting your account on every token it generates.
Or use compute you already pay for.
Spin up a node on any cloud GPU service of your choice and register it with that provider's API token. We encrypt the token at rest, forward your calls, and pay you on every served token — your cloud bill, your margin.
Either side. Same account.
Start earning. Start calling.
Run a node — a GPU you own, Apple Silicon, or a cloud account you already pay for. One balance — the credits you earn are the credits you spend on faster, pooled inference.