Pay only for what your agents actually use.
Top up first. Every chat turn and every billable container second is metered against your prepaid balance. When the balance hits zero, the next call returns 402 Payment Required — no post-pay overages, no surprise bills.
Fast & cheap models
Most agent traffic lives here. Snappy, low-cost, good enough for 80% of turns.
- Moonshot v1 (8k): $0.40 in / $4.00 out per 1M tokens
- Moonshot v1 (8k, vision): $0.40 in / $4.00 out per 1M tokens
- Kimi K2 (0711 preview): $1.20 in / $5.00 out per 1M tokens
- Kimi K2 (0905 preview): $1.20 in / $5.00 out per 1M tokens
- Kimi K2 Thinking: $1.20 in / $5.00 out per 1M tokens
- Moonshot v1 (32k): $2.00 in / $6.00 out per 1M tokens
- Moonshot v1 (32k, vision): $2.00 in / $6.00 out per 1M tokens
- Kimi K2.6: $1.90 in / $8.00 out per 1M tokens
- Claude Haiku 4.5: $2.00 in / $10.00 out per 1M tokens
- Moonshot v1 (128k): $4.00 in / $10.00 out per 1M tokens
- Moonshot v1 (128k, vision): $4.00 in / $10.00 out per 1M tokens
Frontier models
When the task actually needs a brain. Same wallet, same endpoint — just a different config field.
- Kimi K2 Thinking Turbo: $2.30 in / $16.00 out per 1M tokens
- Kimi K2 Turbo (preview): $2.30 in / $16.00 out per 1M tokens
- Claude Sonnet 4.6: $6.00 in / $30.00 out per 1M tokens
- Claude Opus 4.7: $10.00 in / $50.00 out per 1M tokens
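As a worked example, here is a minimal sketch of how the published per-1M-token rates translate into the charge for a single turn. The rates are copied from the lists above; the `turn_cost` helper and the dictionary keys are illustrative names, not part of the Nimblesite API.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the lists above.
# Keys are display names used for illustration, not official model identifiers.
RATES_PER_1M = {
    "Moonshot v1 (8k)": (Decimal("0.40"), Decimal("4.00")),
    "Claude Haiku 4.5": (Decimal("2.00"), Decimal("10.00")),
    "Claude Opus 4.7": (Decimal("10.00"), Decimal("50.00")),
}


def turn_cost(model: str, tokens_in: int, tokens_out: int) -> Decimal:
    """Return the metered cost in USD of one turn: tokens times the per-model rate."""
    rate_in, rate_out = RATES_PER_1M[model]
    return (tokens_in * rate_in + tokens_out * rate_out) / Decimal(1_000_000)
```

For instance, a Claude Haiku 4.5 turn with 2,000 input tokens and 500 output tokens works out to (2,000 × $2.00 + 500 × $10.00) / 1,000,000 = $0.009.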
Sandbox containers
Per-second compute for sandboxed agents — filesystem, shell, build pipeline included.
- shared-cpu-1x · 256 MB: $0.02/hr
- shared-cpu-2x · 512 MB: $0.03/hr
- shared-cpu-4x · 1 GB: $0.06/hr
- Auto-pauses after 10 minutes idle. Resumes on the next chat.
- Billed per second, and only while the sandbox is running.
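To make per-second billing concrete, the sketch below converts the hourly rates listed above into a charge for a given number of running seconds. The `sandbox_cost` helper is a hypothetical name for illustration; it assumes the per-second rate is simply the hourly rate divided by 3,600.

```python
from decimal import Decimal

# Hourly rates in USD, copied from the list above.
HOURLY_RATE_USD = {
    "shared-cpu-1x": Decimal("0.02"),
    "shared-cpu-2x": Decimal("0.03"),
    "shared-cpu-4x": Decimal("0.06"),
}


def sandbox_cost(machine: str, running_seconds: int) -> Decimal:
    """Return the cost in USD for the seconds a sandbox was actually running.

    Paused time is excluded by the caller: only running seconds are passed in.
    """
    return HOURLY_RATE_USD[machine] * running_seconds / Decimal(3600)
```

Thirty minutes of running time on shared-cpu-4x comes to $0.03. Time spent auto-paused is not counted.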
How metered usage works
| Step | What happens | What it costs |
|---|---|---|
| 1. Top up | Add funds to your wallet via Stripe. Whole-dollar amounts, $5 minimum. | Face value. No fee on top-up. |
| 2. Segment keys | Use separate user-bound API keys for prod, staging, CI, partner, or customer-specific traffic. | Same prepaid tenant balance; one hard cap. |
| 3. Chat | Each turn reserves the maximum it could cost, runs the model, then refunds the unused portion. | Tokens × the published per-model rate above. |
| 4. Sandbox | Sandboxed agents reserve their session cap on spin-up; actual seconds settle as the container runs. | Seconds × the per-machine rate above. |
| 5. Stop | No traffic, no debit. The wallet just sits there until your next call. | $0 |
| Empty wallet | The next chat or sandbox call returns 402 Payment Required before any additional cost is incurred. | Nothing; the call doesn't run. |
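The reserve, run, and refund flow in step 3 can be pictured with a small in-memory model. This is only an illustration of the accounting described in the table, not Nimblesite's implementation; the `Wallet`, `PaymentRequired`, and `run_turn` names are invented for the example.

```python
from decimal import Decimal
from typing import Callable


class PaymentRequired(Exception):
    """Stands in for the 402 Payment Required response."""


class Wallet:
    def __init__(self, balance: Decimal) -> None:
        self.balance = balance

    def reserve(self, amount: Decimal) -> None:
        # Refuse up front if the wallet cannot fund the worst-case cost.
        if amount > self.balance:
            raise PaymentRequired("402 Payment Required")
        self.balance -= amount

    def settle(self, reserved: Decimal, actual: Decimal) -> None:
        # Refund the unused portion of the reservation.
        self.balance += reserved - actual


def run_turn(wallet: Wallet, max_cost: Decimal,
             run_model: Callable[[], Decimal]) -> Decimal:
    """Reserve the maximum cost, run the turn, then settle the actual cost."""
    wallet.reserve(max_cost)
    actual = min(run_model(), max_cost)  # a turn never costs more than its reservation
    wallet.settle(max_cost, actual)
    return actual
```

Because the reservation is taken before the model runs, a turn that the wallet cannot fund is rejected without touching the balance.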
Frequently Asked Questions
Are there monthly plans?
Today the public pricing model is prepaid metered usage. Monthly prepaid plans or included-usage bundles may come later, but they will not introduce post-pay overages. The wallet remains the hard stop.
What's included in the per-token rate?
Everything you'd otherwise wire up yourself: persistent per-tenant conversation memory, multi-tenant isolation, the chat → tool-call → result loop, sandbox provisioning and lifecycle, the model abstraction layer, and ongoing tracking of new providers and model SKUs. One published rate per model — what you see on this page is what gets metered.
How do I try it without risking spend?
Prepaid credit is the safety rail. Early accounts can receive starter credit so you can build and test against a fixed cap. Email us with what you're building and we'll set you up.
Can I separate usage by API key?
Yes. Use separate user-bound API keys for prod, staging, CI, partners, or customer streams. The tenant balance is still the shared hard cap, so segregated traffic does not create a new overspend path.
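One way to keep those keys apart on the client side is to select the key by environment at startup. The sketch below is an assumption about how you might organise this in your own code; the environment variable names are hypothetical and are not defined by Nimblesite.

```python
import os
from typing import Mapping

# Hypothetical variable names; use whatever your deployment tooling provides.
KEY_VARIABLES = {
    "prod": "NIMBLESITE_API_KEY_PROD",
    "staging": "NIMBLESITE_API_KEY_STAGING",
    "ci": "NIMBLESITE_API_KEY_CI",
}


def api_key_for(environment: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the API key configured for the given environment name."""
    if environment not in KEY_VARIABLES:
        raise ValueError(f"unknown environment: {environment!r}")
    variable = KEY_VARIABLES[environment]
    key = environ.get(variable)
    if not key:
        raise RuntimeError(f"{variable} is not set")
    return key
```

All of these keys still draw on the same tenant balance, so the separation helps with attribution rather than with budgeting.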
Do you handle billing for my customers?
No. Nimblesite meters and charges your Nimblesite account for platform usage. Your own subscriptions, invoices, credits, margins, and pass-through pricing stay in your product.
What happens when my wallet hits zero?
The next chat or sandbox call returns 402 Payment Required. No additional cost is incurred, because we never let a request run that the wallet can't fund. Top up and your existing conversations resume exactly where they left off; nothing is lost.
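On the client side, it helps to treat 402 as a distinct "top up needed" condition rather than a generic failure. The following is a minimal sketch using only the Python standard library. The bearer-token header and the `post_json` and `WalletEmpty` names are assumptions for illustration; the URL is passed in by the caller because endpoint paths are not listed on this page.

```python
import json
import urllib.error
import urllib.request


class WalletEmpty(Exception):
    """Raised when the API answers 402 Payment Required."""


def post_json(url, api_key, payload, opener=urllib.request.urlopen):
    """POST a JSON payload and map a 402 response to WalletEmpty."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Check the API docs for the real header.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    try:
        with opener(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise WalletEmpty("wallet balance exhausted; top up to continue") from err
        raise  # any other HTTP error is left for the caller to handle
```

Catching `WalletEmpty` separately lets an application pause its agents and alert whoever manages the balance, instead of retrying a call that cannot succeed until the wallet is funded.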
Where can I see what I've spent?
The dashboard shows your live wallet balance, lifetime top-up, and lifetime spend. Every reservation, settlement, refund, and top-up is in an append-only ledger you can paginate through via GET /api/v1/wallet/ledger. Nothing is hidden.
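A sketch of walking the ledger page by page is shown below. The response shape is not documented on this page, so the `entries` and `next_cursor` field names are assumptions, as is cursor-based pagination itself. The `fetch_page` callable is a placeholder for whatever function performs the actual GET /api/v1/wallet/ledger request.

```python
from typing import Callable, Iterator, Optional


def iter_ledger(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every ledger entry, following cursors until the last page.

    fetch_page(cursor) is expected to return a dict with an "entries" list
    and an optional "next_cursor" value (both field names are assumed).
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["entries"]
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

Keeping the HTTP call behind `fetch_page` means the loop can be adapted to the real pagination parameters without changing the code that consumes the entries.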
Do unused tokens or credits expire?
No. The wallet balance is yours until you spend it. There is no monthly reset.
Can I self-host?
No. Nimblesite is a proprietary hosted API — there is no public source and no self-host path. Every integration calls api.nimblesite.ai with an API key. Enterprise customers who need data residency can discuss a dedicated single-tenant deployment; contact sales.
Build against a fixed cap.
Create your account, top up a small balance, and see exactly what metered prepaid usage looks like on real traffic. No card needed to sign up.