Prepaid, predictable, metered

Pay only for what your agents actually use.

Top up first. Every chat turn and every billable container second is metered against your prepaid balance. When the balance hits zero, the next call returns 402 Payment Required — no post-pay overages, no surprise bills.
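A minimal sketch of how a client might treat that 402 response. Only the 402 Payment Required status comes from the text above; `run_turn`, `WalletEmpty`, and the caller-supplied `send` function are hypothetical names used for illustration, not part of the Nimblesite API.

```python
from http import HTTPStatus
from typing import Callable, Tuple


class WalletEmpty(Exception):
    """Raised when the API reports that the prepaid balance cannot fund the call."""


def run_turn(send: Callable[[], Tuple[int, str]]) -> str:
    """Run one request and surface an empty wallet as a distinct error.

    `send` is a hypothetical caller-supplied function that performs the HTTP
    request and returns (status_code, body).
    """
    status, body = send()
    if status == HTTPStatus.PAYMENT_REQUIRED:  # 402
        # The call did not run, so retrying is pointless until the wallet is topped up.
        raise WalletEmpty("Balance exhausted; top up before retrying.")
    if status >= 400:
        raise RuntimeError(f"Request failed with HTTP {status}")
    return body
```

Raising a dedicated exception lets calling code distinguish "top up and try again" from ordinary failures instead of retrying in a loop.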

Fast & cheap models

Most agent traffic lives here. Snappy, low-cost, good enough for 80% of turns.

from $0.40 / 1M input tokens
  • Moonshot v1 (8k): $0.40 in / $4.00 out per 1M tokens
  • Moonshot v1 (8k, vision): $0.40 in / $4.00 out per 1M tokens
  • Kimi K2 (0711 preview): $1.20 in / $5.00 out per 1M tokens
  • Kimi K2 (0905 preview): $1.20 in / $5.00 out per 1M tokens
  • Kimi K2 Thinking: $1.20 in / $5.00 out per 1M tokens
  • Moonshot v1 (32k): $2.00 in / $6.00 out per 1M tokens
  • Moonshot v1 (32k, vision): $2.00 in / $6.00 out per 1M tokens
  • Kimi K2.6: $1.90 in / $8.00 out per 1M tokens
  • Claude Haiku 4.5: $2.00 in / $10.00 out per 1M tokens
  • Moonshot v1 (128k): $4.00 in / $10.00 out per 1M tokens
  • Moonshot v1 (128k, vision): $4.00 in / $10.00 out per 1M tokens
See all models
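As a rough worked example of how the per-token rates translate into the cost of one turn, the sketch below multiplies token counts by the published per-1M rates. The function name `turn_cost` and the token counts are illustrative assumptions; the platform's actual rounding rules are not stated on this page.

```python
from decimal import Decimal

TOKENS_PER_RATE_UNIT = Decimal(1_000_000)  # rates are quoted per 1M tokens


def turn_cost(input_tokens: int, output_tokens: int,
              in_rate: str, out_rate: str) -> Decimal:
    """Return the dollar cost of one turn given per-1M-token rates.

    Decimal avoids the binary floating-point drift that creeps into money math.
    """
    weighted = (Decimal(input_tokens) * Decimal(in_rate)
                + Decimal(output_tokens) * Decimal(out_rate))
    return weighted / TOKENS_PER_RATE_UNIT


# Hypothetical turn on Claude Haiku 4.5 ($2.00 in / $10.00 out per 1M tokens):
# 12,000 input tokens and 800 output tokens.
haiku_turn = turn_cost(12_000, 800, "2.00", "10.00")
```

Here the input side costs $0.024 and the output side $0.008, so the turn comes to $0.032.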

Frontier models

When the task actually needs a brain. Same wallet, same endpoint — just a different config field.

from $2.30 / 1M input tokens
  • Kimi K2 Thinking Turbo: $2.30 in / $16.00 out per 1M tokens
  • Kimi K2 Turbo (preview): $2.30 in / $16.00 out per 1M tokens
  • Claude Sonnet 4.6: $6.00 in / $30.00 out per 1M tokens
  • Claude Opus 4.7: $10.00 in / $50.00 out per 1M tokens
See all models

Sandbox containers

Per-second compute for sandboxed agents — filesystem, shell, build pipeline included.

from $0.02 / hour, while running
  • shared-cpu-1x · 256 MB: $0.02/hr
  • shared-cpu-2x · 512 MB: $0.03/hr
  • shared-cpu-4x · 1 GB: $0.06/hr
  • Auto-pauses after 10 min idle. Resumes on the next chat.
  • Billed per second; only while the sandbox is running.
How sandboxes work
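To make per-second billing concrete, the sketch below converts the hourly rates listed above into a cost for a given number of running seconds. The `sandbox_cost` helper is hypothetical, and it assumes cost is simply the hourly rate prorated by seconds; any rounding the platform applies is not described here.

```python
from decimal import Decimal

SECONDS_PER_HOUR = Decimal(3600)

# Hourly rates copied from the list above.
HOURLY_RATES = {
    "shared-cpu-1x": Decimal("0.02"),
    "shared-cpu-2x": Decimal("0.03"),
    "shared-cpu-4x": Decimal("0.06"),
}


def sandbox_cost(machine: str, running_seconds: int) -> Decimal:
    """Prorate the hourly rate over the seconds the sandbox was running.

    Paused (idle) time is excluded by the caller, since only running
    seconds are billed.
    """
    return HOURLY_RATES[machine] * Decimal(running_seconds) / SECONDS_PER_HOUR


# A shared-cpu-2x sandbox that runs for 15 minutes (900 seconds).
fifteen_minutes = sandbox_cost("shared-cpu-2x", 900)
```

Under these assumptions, 15 minutes on shared-cpu-2x comes to $0.0075.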

How metered usage works

Step | What happens | What it costs
1. Top up | Add funds to your wallet via Stripe. Whole-dollar amounts, $5 minimum. | Face value. No fee on top-up.
2. Segment keys | Use separate user-bound API keys for prod, staging, CI, partner, or customer-specific traffic. | Same prepaid tenant balance; one hard cap.
3. Chat | Each turn reserves the maximum it could cost, runs the model, then refunds the unused portion. | Tokens × the published per-model rate above.
4. Sandbox | Sandboxed agents reserve their session cap on spin-up; actual seconds settle as the container runs. | Seconds × the per-machine rate above.
5. Stop | No traffic, no debit. The wallet just sits there until your next call. | $0
Empty wallet | The next chat or sandbox call returns 402 Payment Required before any additional cost is incurred. | Nothing: the call doesn't run.
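The reserve-then-refund flow in the chat and sandbox steps can be pictured with a small in-memory model. This is an illustrative sketch, not the platform's implementation: the `Wallet` class, its method names, and the rule that a reservation larger than the balance is rejected are all assumptions made for the example.

```python
from decimal import Decimal


class InsufficientFunds(Exception):
    """Stands in for the 402 Payment Required response in this toy model."""


class Wallet:
    """Toy prepaid wallet: reserve the worst case, then refund what was unused."""

    def __init__(self, balance: Decimal) -> None:
        self.balance = balance

    def reserve(self, max_cost: Decimal) -> Decimal:
        # Assumption: a call is refused if its worst-case cost cannot be funded.
        if max_cost > self.balance:
            raise InsufficientFunds("reservation exceeds balance")
        self.balance -= max_cost
        return max_cost

    def settle(self, reserved: Decimal, actual_cost: Decimal) -> Decimal:
        # The reservation is the cap, so actual cost can never exceed it.
        if actual_cost > reserved:
            raise ValueError("actual cost exceeds reservation")
        refund = reserved - actual_cost
        self.balance += refund
        return refund


wallet = Wallet(Decimal("5.00"))
held = wallet.reserve(Decimal("0.50"))           # worst case for the turn
refund = wallet.settle(held, Decimal("0.032"))   # what the turn actually cost
```

After this turn the model wallet holds $4.968: the $0.50 reservation was taken up front and $0.468 of it came back at settlement.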

Frequently Asked Questions

Are there monthly plans?

Today the public pricing model is prepaid metered usage. Monthly prepaid plans or included-usage bundles may come later, but they will not introduce post-pay overages. The wallet remains the hard stop.

What's included in the per-token rate?

Everything you'd otherwise wire up yourself: persistent per-tenant conversation memory, multi-tenant isolation, the chat → tool-call → result loop, sandbox provisioning and lifecycle, the model abstraction layer, and ongoing tracking of new providers and model SKUs. One published rate per model — what you see on this page is what gets metered.

How do I try it without risking spend?

Prepaid credit is the safety rail. Early accounts can receive starter credit so you can build and test against a fixed cap. Email us with what you're building and we'll set you up.

Can I separate usage by API key?

Yes. Use separate user-bound API keys for prod, staging, CI, partners, or customer streams. The tenant balance is still the shared hard cap, so segregated traffic does not create a new overspend path.

Do you handle billing for my customers?

No. Nimblesite meters and charges your Nimblesite account for platform usage. Your own subscriptions, invoices, credits, margins, and pass-through pricing stay in your product.

What happens when my wallet hits zero?

The next chat or sandbox call returns 402 Payment Required. No additional cost is incurred, because a request never runs unless the wallet can fund it. Top up and your existing conversations resume exactly where they left off; nothing is lost.

Where can I see what I've spent?

The dashboard shows your live wallet balance, lifetime top-up, and lifetime spend. Every reservation, settlement, refund, and top-up is in an append-only ledger you can paginate through via GET /api/v1/wallet/ledger. Nothing is hidden.
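A minimal sketch of walking the ledger page by page. Only the path GET /api/v1/wallet/ledger comes from the text above; the cursor-style pagination, the `entries` and `next_cursor` field names, and the `fetch_page` callable are assumptions for illustration, so check the API reference for the real parameters and response shape.

```python
from typing import Callable, Dict, Iterator, Optional

LEDGER_PATH = "/api/v1/wallet/ledger"  # path taken from the text above


def iter_ledger(
    fetch_page: Callable[[Optional[str]], Dict],
) -> Iterator[Dict]:
    """Yield ledger entries across all pages.

    `fetch_page` is a hypothetical caller-supplied function that issues
    GET LEDGER_PATH for the given cursor (None for the first page) and
    returns the decoded JSON body. The `entries` / `next_cursor` keys are
    assumed, not documented.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("entries", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break  # no further pages


# Usage with a stubbed fetcher standing in for real HTTP calls.
_fake_pages = {
    None: {"entries": [{"type": "top_up"}, {"type": "reservation"}],
           "next_cursor": "p2"},
    "p2": {"entries": [{"type": "refund"}], "next_cursor": None},
}
entry_types = [e["type"] for e in iter_ledger(lambda c: _fake_pages[c])]
```

Keeping the HTTP call behind a callable makes the paging logic easy to test without network access or a real API key.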

Do unused tokens or credits expire?

No. The wallet balance is yours until you spend it. There is no monthly reset.

Can I self-host?

No. Nimblesite is a proprietary hosted API — there is no public source and no self-host path. Every integration calls api.nimblesite.ai with an API key. Enterprise customers who need data residency can discuss a dedicated single-tenant deployment; contact sales.

Build against a fixed cap.

Create your account, top up a small balance, and see exactly what metered prepaid usage looks like on real traffic. No card needed to sign up.