Prepaid, predictable Agents as a Service

Build agentic products.
We provide the platform.

One JSON config in. One HTTP endpoint out. Memory, tools, tenancy, templates, every major model, and prepaid spend controls included. Call the HTTP API directly today using the OpenAPI spec, Swagger UI, or ReDoc; SDKs for every major platform will be released.

agent.config.json
POST /api/v1/configs
{
  "name": "Site Editor",
  "model": "claude-sonnet-4-6",
  "tools": ["read_file", "write_file"]
}
Awaiting fulfillment...
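
As a rough illustration of the request above, here is a minimal Python sketch that builds the POST /api/v1/configs call with the standard library. The host name and the bearer-token header are assumptions made for the example; the page does not state them, so check /docs for the real values.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the base URL from your account.
BASE_URL = "https://api.example.invalid"


def build_create_config_request(api_key: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/configs request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/configs",
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check /docs for the real scheme.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


site_editor = {
    "name": "Site Editor",
    "model": "claude-sonnet-4-6",
    "tools": ["read_file", "write_file"],
}
request = build_create_config_request("YOUR_API_KEY", site_editor)
# To send it for real: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect before any balance is spent.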

The production foundation every agentic feature needs.

Stop rebuilding the agent loop for every product. We handle memory, tenancy, tool dispatch, model routing, usage controls, and workspace lifecycle.

Stateful, by default

Pass a conversation_id and we already know what happened. No messages table. No replay loop. No token-budget management. Memory just works.

Two ways to run tools

Client-side tools: the agent emits tool calls and your app executes them inside your own trust boundary. Or flip one config flag and we run a managed sandbox (filesystem, shell, build pipeline) per agent.
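
To make the client-side mode concrete, here is a minimal sketch of a local tool dispatcher for the read_file and write_file tools named in the sample config. The shape of a tool call (a dict with "name" and "arguments") is a hypothetical one chosen for illustration; the real wire format is whatever the OpenAPI document specifies.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Assumption: tools operate inside one working directory chosen by your app.
WORKDIR = Path(tempfile.mkdtemp())


def read_file(path: str) -> str:
    return (WORKDIR / path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    (WORKDIR / path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {path}"


# Registry keyed by the tool names listed in the agent config.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "write_file": write_file,
}


def dispatch(tool_call: dict) -> str:
    """Run one tool call locally; the call shape here is hypothetical."""
    name = tool_call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**tool_call["arguments"])


dispatch({"name": "write_file",
          "arguments": {"path": "index.html", "content": "<h1>Hi</h1>"}})
page = dispatch({"name": "read_file", "arguments": {"path": "index.html"}})
```

Because the functions run in your process, they see only what your app lets them see. A production version would also validate paths so a tool call cannot escape the working directory.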

Every model, one API

Claude, GPT, Gemini, Ollama, DeepSeek — same JSON config, same endpoint. New frontier model drops next week? Change one line. Your code never moves.
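
A small sketch of what "change one line" could look like in practice: the config is copied and only its model field differs. The replacement identifier below is a placeholder, not a real model name.

```python
import copy

base_config = {
    "name": "Site Editor",
    "model": "claude-sonnet-4-6",
    "tools": ["read_file", "write_file"],
}


def with_model(config: dict, model: str) -> dict:
    """Return a copy of the config that differs only in its model field."""
    updated = copy.deepcopy(config)
    updated["model"] = model
    return updated


# "another-model-id" is a placeholder, not a real model identifier.
candidate = with_model(base_config, "another-model-id")
changed_keys = [k for k in base_config if base_config[k] != candidate[k]]
```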

Keys for segregated usage

Use separate user-bound API keys for prod, staging, CI, partners, or customer streams. Rotate one key without disrupting the whole tenant.
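
One way to keep usage streams separate on the client side is to resolve a distinct key per stream and refuse to fall back to another one. The sketch below assumes keys live in environment variables; the variable names are hypothetical.

```python
from typing import Mapping

# Assumption: hypothetical environment variable names, one key per usage stream.
KEY_ENV_VARS = {
    "prod": "NIMBLESITE_KEY_PROD",
    "staging": "NIMBLESITE_KEY_STAGING",
    "ci": "NIMBLESITE_KEY_CI",
}


def api_key_for(stream: str, environ: Mapping[str, str]) -> str:
    """Look up the key for one stream, failing loudly instead of falling back."""
    var_name = KEY_ENV_VARS.get(stream)
    if var_name is None:
        raise KeyError(f"no key configured for stream {stream!r}")
    key = environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key


fake_env = {"NIMBLESITE_KEY_CI": "key-for-ci"}
ci_key = api_key_for("ci", fake_env)
```

In real use, pass os.environ instead of the fake mapping. Failing when a key is missing, rather than borrowing the production key, is what keeps a rotated or revoked key from silently affecting other streams.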

No overspend by design

Prepaid balance is the hard gate. If a chat turn or sandbox operation cannot be funded, it stops before vendor cost is incurred.

HTTP API today, SDKs next

POST a config once. POST chat messages forever. Use /openapi.json, Swagger UI at /docs, or ReDoc at /redoc today. SDKs for every major platform will be released.
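
Since the contract is published as an OpenAPI document, a client can enumerate the available operations from it. The sketch below walks the standard "paths" object of a parsed spec; the sample spec is a tiny stand-in containing only the two endpoints mentioned on this page, not the real document.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            if method.lower() in HTTP_METHODS:  # skip keys like "parameters"
                operations.append((method.upper(), path))
    return sorted(operations)


# Tiny stand-in for the document served at /openapi.json; the real one
# will list more paths than these two, which come from this page.
sample_spec = {
    "openapi": "3.1.0",
    "paths": {
        "/api/v1/configs": {"post": {}},
        "/api/v1/conversations/{conversation_id}": {"get": {}, "parameters": []},
    },
}
operations = list_operations(sample_spec)
```

To run this against the live contract, load /openapi.json with json.load and pass the result to list_operations.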

Three layers, three jobs.

Nimblesite sits between raw LLM APIs and DIY frameworks to provide a production-ready agent foundation with prepaid usage controls.

Capability            LLM APIs              Frameworks              Nimblesite
Conversation memory   Ephemeral             Manual (Redis/Vector)   Built-in
Tool dispatch         Raw JSON only         Boilerplate heavy       Declarative
Multi-tenancy         Stateless             Build it yourself       Tenant-scoped
Spend control         Provider bill shock   Build it yourself       Prepaid gate
Implementation        100+ lines            500+ lines              2 HTTP calls

Two execution modes. One API.

Pick per agent how the tools get executed. Memory, model picker, multi-tenancy, and chat contract stay identical.

Default

Client-side tools

The agent decides what to call. Your app runs it. Same pattern as OpenAI function calling and Anthropic tool use — but persisted, multi-tenant, and stateful out of the box.

  • Tools run in your existing backend, mobile app, or CLI.
  • Your data, your APIs, and your secrets are never touched by us.
  • SDKs are optional. A tool is just a function in your app.

New

Sandboxed agents

We provision a managed sandbox per agent — filesystem, shell, network, build pipeline. Same idea as Code Interpreter or E2B, with conversation memory and the model picker wired in.

  • Durable per-config workspace that persists across conversations.
  • Out-of-band SSH / web-terminal access for your users.
  • One HTTP turn per reply, with no client-side tool loop to render.

Compare execution modes →

Architectural integrity

Built for engineers who want a stable OpenAPI contract today and typed SDKs across every major platform tomorrow.

5 minutes to first agent

Hard tenant isolation

Every row is tenant-scoped. API keys are revocable and user-bound. The prepaid balance is the spend ceiling.

Every model, one API.
Anthropic Claude, OpenAI GPT, Google Gemini, Ollama, DeepSeek

Questions you're already asking.

I can already call OpenAI directly. Why do I need this?

Because raw LLM APIs are stateless. Every call is a fresh slate. To turn that into a real assistant, you have to store every message, replay the history on every call, parse tool calls, dispatch them, and loop until the model is done — per tenant, per conversation. That whole stack is the agent loop. We run it for you.

How do you handle conversation history?

Persisted by Nimblesite, scoped per conversation and per tenant. You pass a conversation_id on follow-up chat calls; we load the prior turns, hand them to the model, store the new turns, and return the response. The full transcript is readable any time via GET /api/v1/conversations/{conversation_id} — you never run a database for it.
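
The flow described above might look like the following sketch. Only conversation_id and GET /api/v1/conversations/{conversation_id} come from this page. The host, the bearer-token header, the "message" field, and the chat path are all assumptions, which is why the chat path is passed in as an argument rather than hard-coded.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.invalid"  # assumption: placeholder host


def build_chat_request(chat_path: str, api_key: str, message: str,
                       conversation_id: Optional[str] = None) -> urllib.request.Request:
    """Build a chat POST. `chat_path` and the `message` field are hypothetical."""
    payload = {"message": message}
    if conversation_id is not None:
        # Follow-up turn: the server loads prior turns for this id.
        payload["conversation_id"] = conversation_id
    return urllib.request.Request(
        url=BASE_URL + chat_path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="POST",
    )


def build_transcript_request(api_key: str, conversation_id: str) -> urllib.request.Request:
    """Build GET /api/v1/conversations/{conversation_id}."""
    safe_id = urllib.parse.quote(conversation_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/conversations/{safe_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


follow_up = build_chat_request("/chat-path-from-docs", "YOUR_API_KEY",
                               "Now make the heading blue.", "conv_123")
transcript = build_transcript_request("YOUR_API_KEY", "conv_123")
```

Note that the follow-up payload carries only the new message and the id. No prior turns are replayed by the client, which is the point of server-side memory.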

How do you prevent overspend?

Nimblesite is prepaid first. Billable chat and workspace operations check the tenant balance before vendor cost is incurred. Empty balance means 402 Payment Required, not a surprise invoice.
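
On the client, a 402 is worth handling separately from other failures, because the remedy is a top-up rather than a retry. The sketch below shows one way to do that with the standard library. The opener parameter and the fake opener exist only so the example runs without a network, and the host is a placeholder.

```python
import urllib.error
import urllib.request
from email.message import Message


class InsufficientBalance(Exception):
    """Raised when the API answers 402 Payment Required."""


def send(request: urllib.request.Request, opener=urllib.request.urlopen) -> bytes:
    """Send a request, turning a 402 into a distinct, catchable error."""
    try:
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise InsufficientBalance("prepaid balance cannot fund this call") from err
        raise  # every other HTTP error propagates unchanged


def fake_opener_402(request):
    # Test double that simulates an empty prepaid balance without any network.
    raise urllib.error.HTTPError(request.full_url, 402, "Payment Required", Message(), None)


try:
    send(urllib.request.Request("https://api.example.invalid/"), opener=fake_opener_402)
    outcome = "sent"
except InsufficientBalance:
    outcome = "top up the balance, then retry"
```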

Do you have SDKs?

You can call the HTTP API directly today using /openapi.json, Swagger UI at /docs, or ReDoc at /redoc. Official SDKs for every major platform will be released.

Is TMC Cloud integrated?

Not yet. TMC Cloud coordination is coming, and the plan is for that cloud functionality to become part of nimblesite.ai. Until then, Nimblesite and TMC Cloud remain separate.

Ship your first agent today.

Create an account, grab an API key, and POST your first config in under five minutes. Prepaid — you only pay for what you use.