Models
Nimblesite is provider-agnostic. You select a model in your agent config, and the platform handles routing, key management, billing, and history replay against the provider for you.
Switching providers
This is the whole story:
{
"model_config": {
"provider": "anthropic",
"model": "claude-sonnet-4-6"
}
}
Change it to:
{
"model_config": {
"provider": "moonshotai",
"model": "kimi-k2-0905-preview"
}
}
Your app code doesn't change. Your tool contract doesn't change. The agent's memory persists. The integration is identical.
Supported providers
A model is callable iff it has a published rate on its vendor's pricing page. Anything else returns 503 pricing_unavailable from the chat path — there are no hidden rates, and there is no manual price entry.
| Provider | Example models |
|---|---|
anthropic |
claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5-20251001 |
moonshotai |
kimi-k2.6, kimi-k2-0905-preview, kimi-k2-thinking, moonshot-v1-128k |
ollama |
Any model running on an Ollama instance you operate |
New providers land as their vendors publish accessible pricing pages. If you need one we don't list, open an issue. Per-token rates are on the pricing page.
Inference parameters
The model_config block accepts standard inference parameters:
{
"model_config": {
"provider": "anthropic",
"model": "claude-sonnet-4-6",
"temperature": 0.7,
"max_tokens": 4096,
"top_p": 0.9
}
}
Unknown fields are passed through to the underlying provider. If a provider supports something exotic, you can use it.
How inference is charged
Nimblesite holds the provider keys and runs the inference for you, then deducts the per-token rate published on the pricing page from your prepaid wallet. You never manage ANTHROPIC_API_KEY / MOONSHOTAI_API_KEY yourself, you never see a separate bill from those providers, and you never have to update keys when you switch model. One prepaid balance, every supported model, no post-pay overages.
What the per-token rate covers:
- The chat → tool-call → result loop, persisted per tenant.
- Persistent conversation memory scoped to your tenant — no third-party vector store, no replay loop you have to write.
- Multi-tenant isolation enforced server-side on every request.
- Sandbox provisioning and lifecycle for sandboxed agents.
- Ongoing tracking of new providers, model SKUs, and pricing changes.
Local development with Ollama
For zero-cost local development, point provider: "ollama" at a local Ollama instance:
{
"model_config": {
"provider": "ollama",
"model": "llama3.2"
}
}
Set OLLAMA_HOST=http://localhost:11434 in your environment and the platform talks to the local instance. Great for CI, offline work, and testing without burning tokens.
Model churn is our job
The model landscape moves every week. A new frontier model, a new SDK, a new pricing tier, a new tool-call schema. We track that so you don't have to. When a new model lands and the vendor publishes its rate, it becomes available in Nimblesite as a one-line config change — no refactor of your app, no breaking API, no migration.
That is a big part of what you're paying for.
Every supported model behind one key. Sign up free →