AI compute is taxed three times.
Silicon → hyperscaler → API reseller → you. Meanwhile the real supply — your GPU, your unused subscription quota, your expiring credits — is sitting idle.
- 3×+ markup between silicon and end-user API
- 200M+ idle AI subscriptions + personal GPUs
- 18h daily idle time per GPU on average
- $0.02 electricity per hour on idle Apple Silicon
The third path for AI compute.
Not hyperscalers. A protocol.
One protocol routes requests to four idle supply tiers — local GPUs, unused API credits, startup credits, or spare ChatGPT / Claude / Gemini subscriptions. No reseller markup, no data-center tax.
Change one line. Everything else works.
Native support across every major SDK. Streaming, function calling, tool use, multimodal — all preserved. Claude Code, Cursor, OpenAI / Anthropic SDKs: point them here and go.
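from openai import OpenAI

# Point the standard OpenAI client at the SpareAI endpoint.
client = OpenAI(
    base_url="https://api.spareai.net/v1",
    api_key="sk-spareai-…",
)

# Request a streamed chat completion.
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

# Print text deltas as they arrive; skip chunks that carry no choices.
for chunk in response:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")

Why SpareAI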
Save 80%+
Claude, GPT, Gemini at 10-20% of official pricing. Open source models even cheaper.
One API, every model
OpenAI-compatible endpoint. One line to switch. Works with Claude Code, Cursor, any OpenAI SDK client.
Available everywhere
Credit card or crypto. No region locks. Developers in the Global South can afford the best AI too.
Reliable
Multi-provider automatic failover. Failed requests are never billed. Uptime on par with commercial APIs.
200+ models, one endpoint
Open source and proprietary models, all through a single OpenAI-compatible API
| Model | Input/1M | Output/1M |
|---|---|---|
| Seed 1.6 | $0.25 | $2.00 |
| Command A | $2.50 | $10.00 |
| DeepSeek V3 | $0.32 | $0.89 |
| DeepSeek V3 0324 | $0.20 | $0.77 |
| DeepSeek V3.1 Terminus | $0.21 | $0.79 |
| Gemma 4 26B A4B IT | $0.08 | $0.35 |
| Gemma 4 31B IT | $0.13 | $0.38 |
| Llama 4 Maverick | $0.15 | $0.60 |
| Llama 4 Scout | $0.08 | $0.30 |
| MiniMax 01 | $0.20 | $1.10 |
| Mistral Small 4 | $0.15 | $0.60 |
| Mistral Large 3 2512 | $0.50 | $1.50 |
| Kimi K2 0905 | $0.40 | $2.00 |
| Hermes 3 Llama 3.1 70B | $0.30 | $0.30 |
| Nemotron 3 Nano 30B A3B | $0.05 | $0.20 |
| Qwen 3 Coder Flash | $0.20 | $0.97 |
| Qwen 3 Coder Plus | $0.65 | $3.25 |
| Qwen 3.5 Flash | $0.07 | $0.26 |
| MiMo V2 Pro | $1.00 | $3.00 |
| GLM 4.6 | $0.39 | $1.90 |
203+ models available through the Spare Relay Protocol
Same model. Up to 90% off the official price.
Idle compute has near-zero marginal cost, so the savings pass through. No subscription, no minimum, pure per-token billing.
| Model | SpareAI in | SpareAI out | Official (in / out) | Savings |
|---|---|---|---|---|
| claude-sonnet-4-6 (Anthropic flagship) | $0.60 | $3.0 | $3.0 / $15.0 | −80% |
| claude-opus-4-6 (Anthropic reasoning) | $1.0 | $5.0 | $5.0 / $25.0 | −80% |
| gpt-5.4 (OpenAI frontier coding) | $0.50 | $3.0 | $2.5 / $15.0 | −80% |
| gemini-3-pro (Google multimodal) | $0.40 | $2.4 | $2.0 / $12.0 | −80% |
| qwen-3.5-122b-moe (Open · 10B active MoE) | $0.10 | $0.30 | $0.50 / $1.5 | −80% |
| minimax-m2.5-239b (Open · SOTA coding) | $0.08 | $0.24 | $0.40 / $1.2 | −80% |
Prices per million tokens (USD). Buyers pay 20% of official; providers keep 90% of buyer payment; 10% covers protocol operations.
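To make the split concrete, here is a minimal sketch of the arithmetic described above. It assumes the 20% buyer rate and the 90/10 provider/protocol split apply uniformly to input and output tokens; the `quote` function and the constant names are illustrative, not part of any SpareAI SDK.

```python
from decimal import Decimal

# Official per-million-token prices (USD, input / output) copied from the table above.
OFFICIAL_PER_MILLION = {
    "claude-sonnet-4-6": (Decimal("3.0"), Decimal("15.0")),
    "gpt-5.4": (Decimal("2.5"), Decimal("15.0")),
}

BUYER_RATE = Decimal("0.20")      # buyers pay 20% of official
PROVIDER_SHARE = Decimal("0.90")  # providers keep 90% of the buyer payment


def quote(model, input_tokens, output_tokens):
    """Return (buyer_cost, provider_earnings, protocol_fee) in USD.

    Hypothetical helper: it only reproduces the stated percentages.
    """
    official_in, official_out = OFFICIAL_PER_MILLION[model]
    million = Decimal(1_000_000)
    official_cost = (official_in * input_tokens + official_out * output_tokens) / million
    buyer_cost = official_cost * BUYER_RATE
    provider_earnings = buyer_cost * PROVIDER_SHARE
    protocol_fee = buyer_cost - provider_earnings
    return buyer_cost, provider_earnings, protocol_fee


# One million input tokens plus one million output tokens on claude-sonnet-4-6.
buyer, provider, fee = quote("claude-sonnet-4-6", 1_000_000, 1_000_000)
print(f"buyer pays ${buyer:.2f}, provider earns ${provider:.2f}, protocol fee ${fee:.2f}")
```

For this request the official cost is $18.00, so the buyer pays $3.60, which matches the $0.60 + $3.0 row in the table. `Decimal` is used instead of floats so the percentages do not pick up rounding noise.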
How it works
Providers connect capacity
GPU rigs, API keys, or idle subscriptions — one command to plug in any compute resource.
Protocol routes requests
Requests auto-match to the best provider — lowest price, lowest latency, automatic failover.
Pay per token, earn automatically
Buyers pay per token. Providers earn 90% of every request. Settled in USD or crypto.
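The routing step above can be sketched as a ranked list with failover. This is an assumption-laden illustration, not the protocol's actual matching logic: the `Provider` fields, the price-then-latency ordering, and the `route` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Provider:
    name: str                 # hypothetical provider identifier
    price_per_million: float  # USD the provider asks per 1M tokens
    latency_ms: float         # most recent probe latency


def rank(providers: Iterable[Provider]) -> list[Provider]:
    """Order candidates by price first, then by latency as a tie-breaker."""
    return sorted(providers, key=lambda p: (p.price_per_million, p.latency_ms))


def route(providers: Iterable[Provider], send: Callable[[Provider], str]) -> tuple[str, Provider]:
    """Try providers in ranked order and fail over on error.

    Returns the reply together with the provider that served it, so a
    caller could bill only the attempt that succeeded.
    """
    errors = []
    for provider in rank(providers):
        try:
            return send(provider), provider
        except Exception as exc:  # sketch only: real code would catch narrower errors
            errors.append((provider.name, exc))
    raise RuntimeError(f"all providers failed: {errors}")


# Usage with a stand-in transport: the cheapest provider is offline.
pool = [
    Provider("gpu-a", 0.60, 180.0),
    Provider("key-b", 0.60, 142.0),
    Provider("sub-c", 0.50, 300.0),
]


def fake_send(provider: Provider) -> str:
    if provider.name == "sub-c":
        raise ConnectionError("provider offline")
    return f"served by {provider.name}"


reply, served_by = route(pool, fake_send)
print(reply)  # served by key-b
```

Here `sub-c` ranks first on price but fails, so the request falls through to `key-b`, which beats `gpu-a` on latency at the same price.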
Four ways to provide
Local GPU
Run Llama, DeepSeek, Qwen on your hardware. Set your price, earn per token.
API Key
Forward requests through your own Anthropic/OpenAI key. Keep the margin.
Enterprise Credits
Unused AI credits from accelerators or cloud programs? Turn them into cash.
Subscription
Claude Max / ChatGPT Pro / Gemini Ultra quota sitting idle? Put it to work.
Subscription sharing may violate the terms of service of the subscription vendor (for example Anthropic, OpenAI, or Google). Use at your own risk. GPU and API key forwarding are fully compliant.
Turn your idle GPU or subscription into monthly income.
One command detects what you already have, registers it on-chain, and starts serving. No manual model lists, no port forwarding, no cloud account.
- Zero deps · Node 18+
- Background daemon auto-reconnects
- stop / logs / status anytime
$ npx spareai relay setup
› hub hub.spareai.net
› wallet 0x7a…e31f registered
› probe ok — 142 ms
# daemon up. earnings land in your wallet.
90% to providers. Paid on-chain via PaySplitter to whoever's GPU, API key, credits, or subscription served the request.
10% protocol fee. Keeps the hub, scoring, settlement, and anti-fraud running. Flat. No tiers, no seats, no minimum.
This is a protocol, not a SaaS.
The Spare Relay Protocol (SRP) spec is open. Anyone can run their own Hub, their own provider, their own client. We just ship a reference implementation.
- Open spec: WebSocket handshake, auth, routing, attestation, settlement — all public
- Reference client spareai-cli is open; fork it to target your own hub
- T1–T4 supply abstraction — protocol is hardware-agnostic, any OpenAI-compatible endpoint plugs in
- No single-company lock-in — hubs can be swapped, clients can be re-implemented
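One way to picture the T1–T4 supply abstraction is as a tag on an otherwise uniform endpoint record. The sketch below is not taken from the SRP spec: the mapping of T1 through T4 to the four supply types, the `SupplyEndpoint` fields, the model IDs, and the URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SupplyTier(Enum):
    # Assumed numbering: the page lists these four supply types but does
    # not say which one is T1, T2, T3, or T4.
    T1 = "local_gpu"
    T2 = "api_key"
    T3 = "enterprise_credits"
    T4 = "subscription"


@dataclass(frozen=True)
class SupplyEndpoint:
    tier: SupplyTier
    base_url: str            # any OpenAI-compatible endpoint (placeholder URLs below)
    models: tuple[str, ...]  # model IDs this endpoint can serve (hypothetical IDs)

    def serves(self, model: str) -> bool:
        return model in self.models


def candidates(endpoints, model):
    """Tier-agnostic lookup: filter by model only, never by tier."""
    return [e for e in endpoints if e.serves(model)]


endpoints = [
    SupplyEndpoint(SupplyTier.T1, "http://localhost:8000/v1", ("llama-4-scout",)),
    SupplyEndpoint(SupplyTier.T2, "https://example.invalid/v1", ("claude-sonnet-4-6",)),
    SupplyEndpoint(SupplyTier.T4, "https://example.invalid/v1", ("claude-sonnet-4-6",)),
]

matches = candidates(endpoints, "claude-sonnet-4-6")
print([e.tier.name for e in matches])  # ['T2', 'T4']
```

Because the lookup never branches on `tier`, a local GPU and a forwarded API key are interchangeable from the caller's point of view, which is the hardware-agnostic property the list above describes.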