FAQ
Frequently asked questions about shipit-agent — installation, providers, tools, streaming, troubleshooting, and production deployment.
General
What is shipit-agent for?
Building production agents with tools, MCP servers, streaming reasoning events, and clean provider switching. It's a library, not a framework — the runtime is small (~400 lines), every component is pluggable, and there's no hidden ceremony. Bring your own LLM, your own tools, your own storage; the runtime composes them.
How is it different from LangChain / LangGraph / CrewAI?
- Smaller surface area — `shipit_agent/runtime.py` is one file you can read in a sitting
- No abstraction tax — no chains, no graphs, no mandatory inheritance hierarchies
- Reasoning events as first-class citizens — automatic extraction from any provider
- Bedrock tool-pairing guarantees — multi-iteration loops on AWS Bedrock just work
- Truly incremental streaming — events arrive the instant they're emitted, not buffered until completion
- Zero mandatory deps — every LLM SDK is an optional extra
We aren't trying to replace those libraries. If you need a complex DAG of agents with cycles and state machines, LangGraph is great. If you need a multi-agent role-playing simulation, CrewAI is great. shipit-agent is for the case where you want a clean, observable, single-agent runtime that does exactly what you tell it.
Is it production-ready?
v1.0.1 is the current stable release. Used in production for research agents, customer support bots, and code-review workflows. 92 tests, full CI, gitleaks secret scanning, MIT licensed.
If you find a bug, open an issue.
What's the license?
MIT. Use it however you like.
Installation
Which extras should I install?
| Use case | Install |
|---|---|
| Just trying it out with OpenAI | pip install 'shipit-agent[openai]' |
| Using AWS Bedrock | pip install 'shipit-agent[litellm]' |
| Using Claude directly | pip install 'shipit-agent[anthropic]' |
| Need browser-based URL fetching | pip install 'shipit-agent[playwright]' |
| Want everything | pip install 'shipit-agent[all]' |
pip install works but python -c "import shipit_agent" says ModuleNotFoundError
You probably installed against a different Python interpreter than the one you're running. Try:
```bash
which python python3 python3.11
python3.11 -m pip install shipit-agent
python3.11 -c "import shipit_agent; print(shipit_agent.__version__)"
```

Can I use it with Python 3.10?
No. shipit-agent requires Python 3.11+. The runtime uses several 3.11-only features (StrEnum, improved tomllib, Self type, structural pattern matching enhancements). Upgrading to 3.11 is the only fix.
Providers
Which provider should I start with?
openai with gpt-4o-mini if you have an OpenAI API key — cheapest, fastest, most reliable.
bedrock with openai.gpt-oss-120b-1:0 if you have AWS credentials — cheapest reasoning-capable model.
ollama with llama3.1 if you want everything local and free.
How do I switch providers?
One line in .env:
```bash
SHIPIT_LLM_PROVIDER=openai  # or anthropic, bedrock, gemini, vertex, groq, together, ollama, litellm
```

No code change needed. build_llm_from_env() reads this var and constructs the right adapter automatically.
Why do I get reasoning events on one provider but not another?
Because reasoning content is provider-specific. Models that surface reasoning blocks (OpenAI o-series, Claude with extended thinking, Bedrock gpt-oss, DeepSeek R1) emit them; models that don't (gpt-4o, gpt-4o-mini, Llama 3, Gemini 1.5) won't. shipit-agent extracts whatever the model actually returns — there's no way to add reasoning to a model that doesn't produce it.
See the reasoning guide for the full compatibility matrix.
How do I use my own LiteLLM proxy server?
Set these in .env:
```bash
SHIPIT_LLM_PROVIDER=litellm
SHIPIT_LITELLM_MODEL=gpt-4o-mini  # whatever model the proxy routes to
SHIPIT_LITELLM_API_BASE=https://litellm.my-company.internal
SHIPIT_LITELLM_API_KEY=sk-proxy-token
```

build_llm_from_env() auto-detects proxy mode when SHIPIT_LITELLM_API_BASE is set and uses LiteLLMProxyChatLLM instead of the direct LiteLLM SDK mode.
Vertex AI service-account file?
Set up via SHIPIT_VERTEX_CREDENTIALS_FILE:
```bash
SHIPIT_LLM_PROVIDER=vertex
SHIPIT_VERTEX_CREDENTIALS_FILE=/path/to/sa.json
VERTEXAI_PROJECT=my-gcp-project
VERTEXAI_LOCATION=us-central1
```

The adapter sets GOOGLE_APPLICATION_CREDENTIALS automatically so google-auth picks it up.
Agent types — which one should I use?
shipit_agent ships seven agent classes. Pick the one that matches your task shape:
| Agent | When to use it | Run mode | Docs |
|---|---|---|---|
| Agent | Direct tool use, no planning ceremony, fastest | agent.run(prompt) | Quickstart |
| DeepAgent | Long, multi-step tasks where planning + verification + workspace pay off. The default for serious work. | deep.run(prompt) | DeepAgent |
| GoalAgent | A goal with explicit success criteria — decompose, execute, self-evaluate | goal_agent.run() (no prompt) | GoalAgent |
| ReflectiveAgent | Generate → critique → revise loop until quality threshold met | reflective.run(prompt) | ReflectiveAgent |
| AdaptiveAgent | The agent should write new tools at runtime as it works | adaptive.run(prompt) | AdaptiveAgent |
| Supervisor | Coordinate multiple specialised worker agents | supervisor.run(task) | Supervisor |
| PersistentAgent | Long-running task that must survive crashes via checkpoints | persistent.run(task, agent_id=...) | PersistentAgent |
Rule of thumb: start with Agent.with_builtins(llm=llm). When the task starts to feel too long for a single linear run, switch to DeepAgent.with_builtins(llm=llm) — you keep all the same tools and gain planning, workspace, sub-agent delegation, and the option to enable verification or reflection with one extra flag.
Can I chat with a deep agent live?
Yes — two ways:
- Terminal REPL (one command):

  ```bash
  shipit chat                                                # default: DeepAgent
  shipit chat --agent goal                                   # any other agent type
  shipit chat --rag-file docs/manual.pdf --reflect --verify
  ```

  See DeepAgent docs for the full slash-command list.

- Programmatically:

  ```python
  chat = deep_agent.chat(session_id="user-42")
  for event in chat.stream("Hi"):
      print(event.message)
  result = chat.send("Anything else I should know?")
  ```
How is DeepAgent different from LangChain's create_deep_agent?
It's a strict superset. Same one-liner DX, more capabilities — see the full comparison table. Highlights:
- 7 deep tools vs LangChain's 3
- `rag=`, `verify=True`, `reflect=True`, `goal=Goal(...)` as one-liner flags
- A multi-agent live chat REPL (`shipit chat`) that works for every agent type, not just `DeepAgent`
Where do I find every constructor parameter?
The Parameters Reference page lists every constructor parameter for every agent type and key class — Agent, DeepAgent, GoalAgent, ReflectiveAgent, Supervisor, RAG, and more — with types, defaults, and "use this when" notes.
Tools
My agent has 28 tools and uses too many tokens per turn. What do I do?
Add tool_search to the registry and instruct the model to use it first:
```python
prompt = (
    "Before calling any other tool, first call `tool_search` to confirm "
    "which tool fits. Then proceed."
)
```

The model will get a ranked shortlist of 5 relevant tools instead of seeing all 28. See the tool search guide.
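Conceptually, the shortlist step is just retrieval over tool descriptions. A toy sketch of the idea using naive keyword overlap (the library's actual ranking is not shown here and may differ, e.g. it could use embeddings):

```python
def rank_tools(query: str, tools: dict[str, str], k: int = 5) -> list[str]:
    """Rank tool names by keyword overlap between the query and each
    tool's description, returning the top-k shortlist."""
    query_words = set(query.lower().split())

    def score(item: tuple[str, str]) -> int:
        _, description = item
        return len(query_words & set(description.lower().split()))

    ranked = sorted(tools.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]

# Example: only the relevant tools survive the cut.
tools = {
    "fetch_url": "fetch a web page over http",
    "write_file": "write text to a file on disk",
    "search_web": "search the web for pages matching a query",
}
shortlist = rank_tools("search the web for news", tools, k=2)  # → ["search_web", "fetch_url"]
```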
How do I write a custom tool?
Three things: name, schema(), run(context, **kwargs). See custom tools guide and the worked example in examples/05_custom_tool.py.
Why does my tool's run() get TypeError: got multiple values for argument 'context'?
You hit the v1.0.0 bug. Upgrade to v1.0.1+ — pip install -U shipit-agent — and the runtime now strips context and self from tool-call arguments before forwarding so there's no collision.
Can my tool be async?
Tool run() is sync because the runtime treats every tool call as atomic. If your tool needs async work internally (e.g. concurrent HTTP requests), wrap it with asyncio.run():
```python
def run(self, context, **kwargs):
    import asyncio
    result = asyncio.run(self._async_work(**kwargs))
    return ToolOutput(text=result, metadata={...})
```

For long-running async I/O, consider running the tool work in a thread pool to avoid blocking the runtime's background thread.
How do I disable the planner?
```python
from shipit_agent.policies import RouterPolicy

agent.router_policy = RouterPolicy(auto_plan=False)
```

The plan_task tool stays available in the registry — the model can still call it explicitly — but the runtime won't auto-invoke it before the first LLM call.
Streaming
agent.stream() returns all events at once at the end. Why isn't it streaming?
This was a real bug in pre-1.0 versions. Upgrade to v1.0.1+ — agent.stream() now uses a background thread with a queue.Queue so events are yielded the instant they're emitted.
If you're on 1.0.1+ and still seeing buffering, check whether your terminal is line-buffering output. In Jupyter/VS Code/JupyterLab, use clear_output(wait=True) + display(...) for reliable incremental rendering.
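The pattern behind this fix is generic and worth knowing: a producer thread pushes events into a `queue.Queue`, the generator yields them as they arrive, and a sentinel marks completion. A standalone sketch of the same idea:

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream

def stream_events(produce):
    """Yield items from produce() the moment they are emitted,
    even though production happens on a background thread."""
    q: queue.Queue = queue.Queue()

    def worker():
        try:
            for event in produce():
                q.put(event)
        finally:
            q.put(_DONE)  # always unblock the consumer, even on error

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not _DONE:
        yield item

events = list(stream_events(lambda: iter(["started", "tool_called", "completed"])))
```

Because `q.get()` blocks until the worker puts something, the consumer wakes up per event rather than waiting for the run to finish.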
How do I render a "Thinking" panel from reasoning events?
```python
for event in agent.stream(prompt):
    if event.type == "reasoning_started":
        print(f"🧠 Iteration {event.payload['iteration']} — thinking…")
    elif event.type == "reasoning_completed":
        print(f"🧠 Thought: {event.payload['content']}")
    elif event.type == "tool_called":
        print(f"▶ Calling {event.message}")
```

Run examples/02_streaming_with_reasoning.py for a colored terminal renderer you can copy.
Token-level reasoning streaming?
Not in v1.0. The current LLM adapters are non-streaming (llm.complete() returns once per iteration), so reasoning arrives as a single event per iteration with the full content. Token-level streaming requires adding a .stream() method to the LLM adapters and consuming chunks in the runtime — planned for a future release.
Bedrock
Why does my Bedrock agent crash on the third iteration with "toolResult blocks exceed toolUse blocks"?
This was a real bug in pre-1.0.1 versions. Upgrade to v1.0.1+ — the runtime now guarantees every toolUse block gets a paired toolResult, even when:
- A tool fails (synthetic error result)
- The model hallucinates an unknown tool name (synthetic "tool not registered" result)
- The planner runs (output injected as user-role context, not orphan tool-result)
If you're on 1.0.1+ and still seeing the error, open an issue with the full traceback.
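Conceptually, the guarantee is a pairing pass: every `toolUse` id must be answered by exactly one `toolResult`, with a synthetic error result filling any gap. A simplified illustration (field names follow the Bedrock Converse shape; this is not the runtime's actual code):

```python
def pair_tool_results(tool_uses: list[dict], results_by_id: dict[str, dict]) -> list[dict]:
    """Return one toolResult per toolUse, synthesizing an error
    result for any toolUse that has no real result."""
    paired = []
    for use in tool_uses:
        result = results_by_id.get(use["toolUseId"])
        if result is None:
            # Failed tool or hallucinated tool name: synthesize a result
            # so the toolUse block is never left orphaned.
            result = {
                "toolUseId": use["toolUseId"],
                "status": "error",
                "content": [{"text": "tool failed or not registered"}],
            }
        paired.append(result)
    return paired

uses = [{"toolUseId": "a1"}, {"toolUseId": "b2"}]
real = {"a1": {"toolUseId": "a1", "status": "success", "content": [{"text": "ok"}]}}
paired = pair_tool_results(uses, real)  # b2 gets a synthetic error result
```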
Which Bedrock model should I use?
- `bedrock/openai.gpt-oss-120b-1:0` — cheap, surfaces reasoning blocks, supports tool calling
- `bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0` — more capable, more expensive, supports extended thinking via LiteLLM
- `bedrock/meta.llama3-3-70b-instruct-v1:0` — fast and cheap, no reasoning, weaker tool calling
Do I need boto3 installed?
Not directly. BedrockChatLLM uses LiteLLM under the hood, which has its own AWS client. You only need:
```bash
pip install 'shipit-agent[litellm]'
```

Plus your AWS credentials in env vars (AWS_REGION_NAME, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) or via AWS_PROFILE.
Production deployment
Is code_execution safe for untrusted prompts?
No. CodeExecutionTool runs untrusted Python in a subprocess that inherits your environment, can read/write the workspace, and can make network requests. For production with untrusted input:
- Run shipit-agent inside a Docker container with no host filesystem mounts
- Restrict the workspace to a tmpfs volume
- Drop network access from the container
- Set `allow_shell=False`
- Consider running each agent invocation in a fresh container
For local dev and trusted internal use, the default config is fine.
How do I prevent tool-call infinite loops?
agent.max_iterations = N (default 4). If the cap is reached while the model is still calling tools, the runtime gives it one more turn with tools=[] to force a final summary, so run_completed is never empty.
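The cap-then-summarize behaviour can be sketched as a small loop, where `call_llm` stands in for one runtime iteration (illustrative only, not the runtime's code):

```python
def run_loop(call_llm, max_iterations: int = 4) -> str:
    """call_llm(tools_allowed) returns (text, wants_tools). If the cap
    is hit while the model still wants tools, withhold tools for one
    final turn so the run always ends with a text summary."""
    for _ in range(max_iterations):
        text, wants_tools = call_llm(tools_allowed=True)
        if not wants_tools:
            return text  # model finished naturally
    # Cap reached while still calling tools: one tool-free turn.
    final_text, _ = call_llm(tools_allowed=False)
    return final_text

# Even a model that always wants tools still produces a final answer:
def greedy_model(tools_allowed):
    return ("summary" if not tools_allowed else "", tools_allowed)

answer = run_loop(greedy_model, max_iterations=4)  # → "summary"
```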
How do I store credentials securely?
Use FileCredentialStore for local dev (file is read-only to the user, gitignored), or implement a custom CredentialStore backed by AWS Secrets Manager / HashiCorp Vault / Doppler / your secret store of choice. The protocol is two methods:
```python
class CredentialStore(Protocol):
    def get(self, name: str) -> CredentialRecord | None: ...
    def put(self, record: CredentialRecord) -> None: ...
```

How do I monitor production runs?
Use FileTraceStore to capture every event with timestamps, then ship the trace files to your observability stack:
```python
from shipit_agent import Agent, FileTraceStore

agent = Agent(
    llm=llm,
    trace_store=FileTraceStore(root="/var/log/shipit/traces"),
    trace_id=f"run-{request_id}",
)
```

For real-time observability, write a custom TraceStore that ships events to Datadog / Honeycomb / OpenTelemetry as they're emitted.
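As an illustrative sketch of such a sink, here is a JSONL writer a log shipper (Vector, Fluent Bit, the Datadog agent) could tail in real time. Note the method name and signature are assumed for illustration, not the library's actual TraceStore protocol:

```python
import json
import time

class JsonlTraceStore:
    """Append each trace event as one JSON line so an external
    log shipper can tail the file as events are emitted.
    NOTE: record() is a hypothetical method name; check the real
    TraceStore interface before adapting this."""

    def __init__(self, path: str):
        self.path = path

    def record(self, trace_id: str, event_type: str, payload: dict) -> None:
        line = json.dumps({
            "ts": time.time(),
            "trace_id": trace_id,
            "type": event_type,
            "payload": payload,
        })
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
```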
Troubleshooting
"Missing environment variable for X" — what now?
The error tells you exactly which env var to set. Add it to your .env file or shell environment, then re-run. build_llm_from_env() walks upward from CWD to find .env so it works regardless of which directory you launch from.
gpt-4o-mini describes a plan instead of calling tools
gpt-4o-mini is notoriously lazy about tool use. Three fixes:
- Force tool use: `OpenAIChatLLM(model="gpt-4o-mini", tool_choice="required")`
- Disable the planner (it primes the model to describe instead of execute): `agent.router_policy = RouterPolicy(auto_plan=False)`
- Use a stronger model: `gpt-4o`, `o3-mini`, or any reasoning-capable model
The docs site doesn't update after I push to main
Check https://github.com/shipiit/shipit_agent/actions and look for the Docs workflow. If it's failing, click into the failed run for the error. Common causes:
- Missing `mkdocs.yml` change pushed alongside docs files
- New page added to `docs/` but not added to `mkdocs.yml` nav
- `mkdocs build --strict` failing on a broken link — fix the link and re-push
Gitleaks CI fails on a runtime file
If gitleaks flags a file under .shipit_notebooks/, sessions/, traces/, or memory.json, it's a false positive — those files contain scraped tool outputs (e.g. Pushly client-side IDs from CoinDesk pages) that look like API keys to the rule engine. The v1.0.1 .gitleaks.toml allowlist covers these paths. If a new false-positive pattern surfaces, add it to the regexes list in .gitleaks.toml.
Contributing
How do I submit a fix?
See CONTRIBUTING.md. TL;DR:
- Fork + branch
- `make install-hooks`
- Make your change
- `make check` (lint + test + gitleaks)
- Open a PR
How do I cut a release? (maintainers only)
```bash
make new-release VERSION=1.0.2   # bumps + tests + builds
# review the diff, edit CHANGELOG
git add -A && git commit -m "release: v1.0.2"
git push origin main
make tag                         # creates and pushes v1.0.2 git tag
make publish                     # uploads to PyPI (asks for confirmation)
make github-release              # creates GitHub Release with notes + dist files
```

Or one shot: `make ship-it`.