Deployment
Deploy SHIPIT Agent to production with confidence. This guide covers containerization, scaling, monitoring, and best practices for running agents in production environments.
Docker Deployment
The simplest production deployment uses Docker:
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "-m", "shipit_agent.server"]

Build and run the image:

docker build -t shipit-agent .
docker run -p 8000:8000 --env-file .env shipit-agent

Environment Variables
Production environments should set these variables:
| Variable | Required | Description |
|---|---|---|
| SHIPIT_LLM_PROVIDER | ✅ | LLM provider (openai, anthropic, bedrock, etc.) |
| SHIPIT_LOG_LEVEL | ❌ | Logging verbosity (DEBUG, INFO, WARNING) |
| SHIPIT_MAX_ITERATIONS | ❌ | Safety limit for agent loops (default: 25) |
| SHIPIT_TIMEOUT_SECONDS | ❌ | Per-run timeout (default: 300) |
| SHIPIT_MEMORY_BACKEND | ❌ | Memory storage (file, redis, postgres) |
Provider-specific keys (e.g., OPENAI_API_KEY, AWS_REGION_NAME) are also required depending on your chosen provider.
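The variables in the table can be checked once at startup so that a misconfigured container fails immediately. The following is a minimal sketch, not part of SHIPIT Agent itself: `Settings` and `load_settings` are hypothetical names introduced for illustration, and only the two defaults stated in the table (25 iterations, 300 seconds) are filled in. Variables without a documented default are left as `None`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the SHIPIT_* variables listed in the table."""
    llm_provider: str
    log_level: Optional[str]        # None means "not set"
    max_iterations: int
    timeout_seconds: int
    memory_backend: Optional[str]   # None means "not set"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read SHIPIT_* variables, failing fast on missing or malformed values."""
    provider = env.get("SHIPIT_LLM_PROVIDER", "").strip()
    if not provider:
        raise RuntimeError("SHIPIT_LLM_PROVIDER is required")

    def positive_int(name: str, default: int) -> int:
        # Fall back to the documented default when the variable is absent.
        raw = env.get(name)
        if raw is None:
            return default
        try:
            value = int(raw)
        except ValueError:
            raise RuntimeError(f"{name} must be an integer, got {raw!r}") from None
        if value <= 0:
            raise RuntimeError(f"{name} must be positive, got {value}")
        return value

    return Settings(
        llm_provider=provider,
        log_level=env.get("SHIPIT_LOG_LEVEL"),
        max_iterations=positive_int("SHIPIT_MAX_ITERATIONS", 25),
        timeout_seconds=positive_int("SHIPIT_TIMEOUT_SECONDS", 300),
        memory_backend=env.get("SHIPIT_MEMORY_BACKEND"),
    )
```

Calling `load_settings()` before the server starts accepting traffic surfaces a bad value at deploy time instead of on the first request.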
FastAPI Integration
Run agents behind a production-grade API:
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from shipit_agent import Agent
from shipit_agent.llms import build_llm_from_env

app = FastAPI()
llm = build_llm_from_env()  # build once at startup and share across requests

# Plain `def` endpoints run in FastAPI's threadpool, so the blocking
# agent.run() call does not stall the event loop.
@app.post("/run")
def run_agent(prompt: str):
    agent = Agent.with_builtins(llm=llm)
    result = agent.run(prompt)
    return {"output": result.output, "steps": result.step_count}

@app.post("/stream")
def stream_agent(prompt: str):
    agent = Agent.with_builtins(llm=llm)

    # StreamingResponse iterates a sync generator in a threadpool.
    def generate():
        for event in agent.stream(prompt):
            yield f"data: {event.model_dump_json()}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

Scaling Considerations
Horizontal Scaling
Each agent run is stateless by default. Scale horizontally by running multiple instances behind a load balancer:
# docker-compose.yml
services:
  agent:
    image: shipit-agent
    deploy:
      replicas: 4
    environment:
      - SHIPIT_LLM_PROVIDER=bedrock
      - AWS_REGION_NAME=us-east-1

Memory and State
For stateful agents (sessions, memory), use an external store:
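from shipit_agent import Agent
from shipit_agent.llms import build_llm_from_env
from shipit_agent.memory import RedisMemory

llm = build_llm_from_env()
# In production, point this at a shared Redis instance rather than localhost.
memory = RedisMemory(url="redis://localhost:6379")
agent = Agent.with_builtins(llm=llm, memory=memory)

Rate Limiting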
Use hooks to enforce rate limits per user or API key:
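from shipit_agent import AgentHooks

class RateLimitError(Exception):
    """Raised when a caller exceeds the limit (defined here for illustration)."""

hooks = AgentHooks()

@hooks.on_before_llm
def rate_limit(context):
    # Assumes the caller populates calls_this_minute in context.metadata.
    if context.metadata.get("calls_this_minute", 0) > 60:
        raise RateLimitError("Too many requests")

Monitoring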
Structured Logging
Every agent event is a structured object. Pipe events to your observability stack:
import json
import logging

logger = logging.getLogger("shipit")

# `agent` and `prompt` come from your application code.
for event in agent.stream(prompt):
    logger.info(json.dumps({
        "type": event.type,
        "message": event.message,
        "timestamp": event.timestamp.isoformat(),
        "step": event.step_number,
    }))

Cost Tracking
Use the on_after_llm hook to track token usage and costs:
# calculate_cost and metrics are placeholders for your own pricing helper and metrics client.
@hooks.on_after_llm
def track_cost(context, response):
    tokens = response.usage
    cost = calculate_cost(tokens, context.llm.model)
    metrics.increment("llm_cost", cost)
    metrics.increment("llm_tokens", tokens.total)

Security Checklist
Before deploying to production:
- [ ] API keys are stored in environment variables, never in code
- [ ] Agent timeout is configured (SHIPIT_TIMEOUT_SECONDS)
- [ ] Max iterations are limited (SHIPIT_MAX_ITERATIONS)
- [ ] Tool permissions are explicitly configured
- [ ] File system access is sandboxed (if using file tools)
- [ ] Network access is restricted to required domains
- [ ] Logging does not capture sensitive user data
- [ ] Rate limiting is enabled for public-facing endpoints
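
The rate-limiting item in the checklist needs something that counts calls per user or API key. The following is a minimal sketch of a sliding-window counter. `SlidingWindowLimiter` is a hypothetical helper, not part of SHIPIT Agent, and it keeps its state in process memory, so it only limits traffic that reaches a single instance.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Hypothetical in-memory limiter: at most `limit` calls per `window` seconds per key."""

    def __init__(
        self,
        limit: int = 60,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a call for `key` and return True if it is within the limit."""
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True
```

A hook or endpoint could call `limiter.allow(api_key)` and reject the request when it returns `False`. With several replicas behind a load balancer, the counters would need to live in a shared store such as Redis for the limit to hold across instances.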
Next Steps
- Error Recovery — handle failures gracefully in production
- Hooks & Middleware — add monitoring, rate limiting, and guardrails
- Async Runtime — use with FastAPI and modern async Python