Documentation

Nimblesite vs raw LLM APIs

Why you need more than a stateless LLM call to ship a real AI feature.


Short version: Raw LLM APIs are stateless. Every call is a fresh start. Nimblesite is the agent that wraps all of that — memory, loop, tool dispatch, multi-tenancy — already wired up.

What a raw LLM call looks like

Here's what you write when you talk to Claude or GPT directly:

import anthropic

# load_messages_from_db, save_messages_to_db, run_tool, and my_tools
# are all yours to build and maintain.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def handle_message(conversation_id):
    # Every call is a fresh start. You send the full history every time.
    history = load_messages_from_db(conversation_id)
    history.append({"role": "user", "content": "Hello"})

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=history,
        tools=my_tools,
    )

    # If the response contains tool calls, parse them, run them,
    # append the results to history, and call again.
    while response.stop_reason == "tool_use":
        # Append the assistant turn once, then one result per tool call.
        history.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": run_tool(block.name, block.input),
                })
        history.append({"role": "user", "content": tool_results})

        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            messages=history,
            tools=my_tools,
        )

    # Save the final history back to the database.
    save_messages_to_db(conversation_id, history)

    return response.content[-1].text

Now multiply that by:

  • Multi-tenancy (each tenant has their own API key and their own messages table scope)
  • Prompt templating (each tenant has their own system prompt)
  • Provider failover (what if Anthropic is down?)
  • Token budget management (what if the history is longer than the context window?)
  • Logging, audit, and replay
  • Handling new SDK versions when the provider ships a breaking change
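To make one of those bullets concrete, here is a minimal sketch of token budget management: trimming old messages so the history fits the context window. The function name and the 4-characters-per-token heuristic are illustrative assumptions, not any provider's API; real implementations count tokens with the provider's tokenizer.

```python
def trim_history(history, max_tokens=100_000, chars_per_token=4):
    """Drop the oldest messages until a rough token estimate fits the budget.

    Illustrative sketch only: uses a crude chars/token heuristic rather
    than a real tokenizer, and ignores system prompts and tool schemas.
    """
    def estimate(msgs):
        return sum(len(str(m["content"])) for m in msgs) // chars_per_token

    trimmed = list(history)
    # Always keep the most recent message; drop from the front (oldest first).
    while len(trimmed) > 1 and estimate(trimmed) > max_tokens:
        trimmed.pop(0)
    return trimmed
```

And this is just one bullet; each of the others needs its own equivalent piece of plumbing.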

What the same thing looks like on Nimblesite

curl -X POST https://api.nimblesite.dev/api/v1/chat/$CONFIG_ID \
  -H "X-API-Key: $TENANT_KEY" \
  -d '{"message": "Hello"}'

That's it. Memory, loop, tool dispatch, multi-tenancy, prompt templating, provider failover, logging, audit — all already done.
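The same call from Python, using only the URL shape, header, and body shown in the curl command above (no SDK required; the helper name is ours, and the response format is not assumed here):

```python
import json
import urllib.request

def build_chat_request(config_id, tenant_key, message):
    """Build the POST request for Nimblesite's chat endpoint.

    Mirrors the curl example: one endpoint, one tenant key, one message.
    """
    return urllib.request.Request(
        f"https://api.nimblesite.dev/api/v1/chat/{config_id}",
        data=json.dumps({"message": message}).encode(),
        headers={"X-API-Key": tenant_key, "Content-Type": "application/json"},
        method="POST",
    )

# To send it:
# with urllib.request.urlopen(build_chat_request(cfg, key, "Hello")) as resp:
#     body = resp.read()
```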

The honest comparison

Dimension          Raw LLM API         Nimblesite
State              You build it        Built in
Agent loop         You run it          We run it
Tool dispatch      You parse & loop    We parse, you execute
Multi-tenancy      You build it        Built in
Prompt templating  You build it        Built in
Model switching    Rewrite your code   Edit one JSON field
Provider churn     Your problem        Our problem
SDK maintenance    Forever             None
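On the "edit one JSON field" row: the idea is that the model lives in the config behind $CONFIG_ID, not in your code. The exact field names below are illustrative assumptions, not Nimblesite's documented schema:

```json
{
  "model": "claude-sonnet-4-6",
  "system_prompt": "You are the support agent for {{tenant_name}}."
}
```

Changing the model means changing the "model" value; no client code is redeployed.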

When a raw LLM API is a better fit

  • You're making one-shot completion calls with no memory and no tools
  • You need absolute control over the exact bytes on the wire to the provider
  • You're implementing novel agent architectures that don't fit a standard loop

When Nimblesite is a better fit

  • Everything else.