Overview

A clean, powerful Python agent library with tools, MCP, streaming events, reasoning capture, and runtime policies.

Reasoning visibility

Stream live thinking blocks, tool calls, and outputs without a custom runtime.

Provider flexibility

Swap LLM providers with one config change while keeping your tools intact.

Production guardrails

Built-in retry policies, error recovery, hooks, and parallel execution.

MCP and tools

Mix Python tools, remote MCP servers, and connector-style integrations.

Start building powerful agents

The Shipit library provides everything you need to build, deploy, and scale production-ready AI agents.

v1.0.3 — Super RAG, DeepAgent, live chat REPL

New in 1.0.3: Super RAG subsystem (hybrid search, auto-cited sources), DeepAgent factory with verify / reflect / goal / sub-agents, shipit chat live multi-agent terminal REPL, and the Agent memory cookbook. 521 unit tests + 19 real-Bedrock end-to-end smoke tests, all passing. See the changelog.

SHIPIT Agent is a standalone Python agent library focused on a clean runtime:

  • bring your own LLM — or use any of seven built-in provider adapters
  • attach Python tools, remote MCP servers, or connector-style third-party tools (Gmail, Drive, Slack, Linear, Notion, Jira, Confluence)
  • attach packaged or custom skills to steer agent behavior and reusable workflows
  • iterate tool-using agents with configurable retry and router policies
  • stream structured events (including reasoning / thinking blocks) as they happen
  • inspect every step: reasoning, tool arguments, tool outputs, retries, final answer
  • compose reusable agent profiles with system prompts and tool selections locked in
  • keep clean boundaries between runtime, tools, MCP, policies, and profiles

Built for developers who want the agent loop observable, interchangeable, and out of the way.


Install

bash
pip install shipit-agent

With optional extras:

bash
pip install 'shipit-agent[openai]'         # OpenAI SDK
pip install 'shipit-agent[anthropic]'      # Anthropic SDK (native thinking blocks)
pip install 'shipit-agent[litellm]'        # LiteLLM (Bedrock, Gemini, Groq, Together, …)
pip install 'shipit-agent[playwright]'     # In-process browser for open_url and web_search
pip install 'shipit-agent[all]'            # Everything

30-second example

python
from shipit_agent import Agent
from shipit_agent.llms import OpenAIChatLLM

agent = Agent.with_builtins(llm=OpenAIChatLLM(model="gpt-4o-mini"))

for event in agent.stream("Search the web for today's Bitcoin price in USD."):
    print(event.type, event.message)

Emits events like:

bash
run_started           Agent run started
step_started          LLM completion started
reasoning_started     🧠 Model reasoning started
reasoning_completed   🧠 Model reasoning completed
tool_called           Tool called: web_search
tool_completed        Tool completed: web_search
run_completed         Agent run completed

Why SHIPIT Agent

Live reasoning events

Extended thinking blocks from o1/o3/gpt-5/Claude/gpt-oss are automatically extracted and streamed as reasoning_started / reasoning_completed events. Your UI can show a live "Thinking" panel for free.

Truly incremental streaming

agent.stream() runs the agent on a background thread and yields events through a queue as they happen. Works in Jupyter, VS Code, WebSocket, SSE, and terminals.
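The background-thread-plus-queue mechanism can be sketched with the standard library. Everything below (the AgentEvent shape, the emit callback, the sentinel protocol) is illustrative, not shipit-agent's actual internals:

```python
import queue
import threading
from dataclasses import dataclass

@dataclass
class AgentEvent:
    type: str
    message: str

_SENTINEL = object()

def stream(run_agent):
    """Run the agent on a background thread; yield events as they are produced."""
    q: queue.Queue = queue.Queue()

    def worker():
        try:
            run_agent(emit=q.put)
        finally:
            q.put(_SENTINEL)  # always unblock the consumer, even on failure

    threading.Thread(target=worker, daemon=True).start()
    while (event := q.get()) is not _SENTINEL:
        yield event

def demo_run(emit):
    emit(AgentEvent("run_started", "Agent run started"))
    emit(AgentEvent("run_completed", "Agent run completed"))

events = list(stream(demo_run))
```

Because the consumer blocks on the queue rather than on the run itself, events surface the moment they are emitted, which is what makes the pattern work in notebooks and SSE handlers alike.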

Bulletproof Bedrock tool pairing

Every toolUse gets a paired toolResult. Planner output is injected as user context, not orphan tool-results. Hallucinated tool names get synthetic error results. Multi-iteration Bedrock loops just work.
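The pairing invariant can be sketched as follows. The dict shapes are simplified stand-ins for Bedrock's toolUse/toolResult blocks, not shipit-agent's real message format:

```python
def pair_tool_results(tool_uses, registry):
    """Produce exactly one toolResult for every toolUse.

    Hallucinated tool names get a synthetic error result, so the
    use/result pairing the Bedrock conversation requires never breaks.
    """
    results = []
    for use in tool_uses:
        name, args = use["name"], use.get("input", {})
        if name not in registry:
            results.append({"toolUseId": use["toolUseId"],
                            "status": "error",
                            "content": f"Unknown tool: {name}"})
        else:
            results.append({"toolUseId": use["toolUseId"],
                            "status": "success",
                            "content": registry[name](**args)})
    return results

registry = {"add": lambda a, b: a + b}
uses = [{"toolUseId": "t1", "name": "add", "input": {"a": 2, "b": 3}},
        {"toolUseId": "t2", "name": "launch_rocket", "input": {}}]
paired = pair_tool_results(uses, registry)
```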

Semantic tool discovery

tool_search lets the agent ask "which tool should I use for X?" and get a ranked shortlist. No more 28-tool context bloat, no more tool hallucinations.
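The idea can be sketched with naive keyword overlap; the real tool_search does semantic ranking, so treat this as a toy stand-in:

```python
def tool_search(query, tools, top_k=3):
    """Rank tool descriptions by keyword overlap with the query (toy ranking)."""
    q = set(query.lower().split())
    scored = sorted(tools.items(),
                    key=lambda kv: -len(q & set(kv[1].lower().split())))
    return [name for name, _ in scored[:top_k]]

tools = {
    "web_search": "search the web for pages matching a query",
    "open_url": "fetch and render the contents of a url",
    "send_email": "send an email message to a recipient",
}
shortlist = tool_search("search the web for bitcoin price", tools, top_k=2)
```

The agent then receives only the shortlist in context instead of all registered tool schemas.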

Zero-friction provider switching

Edit one line in .env (SHIPIT_LLM_PROVIDER=openai) and build_llm_from_env() does the rest. Seven providers supported out of the box.
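The env-driven factory pattern looks roughly like this. The adapter classes below are empty stand-ins; only the SHIPIT_LLM_PROVIDER variable and the build_llm_from_env name come from the text above:

```python
import os

# Illustrative stand-ins; the real adapters live in shipit_agent.llms.
class OpenAIChatLLM: ...
class AnthropicChatLLM: ...

_PROVIDERS = {"openai": OpenAIChatLLM, "anthropic": AnthropicChatLLM}

def build_llm_from_env():
    """Instantiate whichever adapter SHIPIT_LLM_PROVIDER names (sketch)."""
    name = os.environ.get("SHIPIT_LLM_PROVIDER", "openai")
    try:
        return _PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"Unknown provider: {name}") from None

os.environ["SHIPIT_LLM_PROVIDER"] = "anthropic"
llm = build_llm_from_env()
```

Because the agent only sees the adapter interface, swapping providers never touches tool code.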

Playwright-powered open_url

In-process Chromium fetches JS-rendered pages with a realistic UA, handles anti-bot 503s, and falls back to stdlib urllib if Playwright isn't installed. No external scraper services.

Parallel tool execution

When the LLM returns multiple tool calls, run them concurrently with parallel_tool_execution=True. Results stay in order. Typically 2-3x faster for multi-tool turns.
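The order-preserving concurrency is straightforward to sketch with the standard library, since executor.map returns results in input order regardless of completion order (this is the general mechanism, not shipit-agent's exact code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(calls):
    """Execute (fn, args) tool calls concurrently; map preserves input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda c: c[0](*c[1]), calls))

def slow_double(x):
    time.sleep(0.05)  # stand-in for a network-bound tool call
    return 2 * x

results = run_tools_parallel([(slow_double, (1,)),
                              (slow_double, (2,)),
                              (slow_double, (3,))])
```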

Hooks & middleware

AgentHooks with @on_before_llm, @on_after_llm, @on_before_tool, @on_after_tool for cost tracking, rate limiting, content filtering, and guardrails. No subclassing.
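A minimal decorator-based hook registry illustrates the shape of the API; the class below is a sketch, not the real AgentHooks:

```python
class AgentHooks:
    """Toy hook registry: register callbacks without subclassing the agent."""
    def __init__(self):
        self._before_tool = []
        self._after_tool = []

    def on_before_tool(self, fn):
        self._before_tool.append(fn)
        return fn

    def on_after_tool(self, fn):
        self._after_tool.append(fn)
        return fn

    def call_tool(self, name, tool, *args):
        for fn in self._before_tool:   # e.g. rate limiting, argument filtering
            fn(name, args)
        result = tool(*args)
        for fn in self._after_tool:    # e.g. cost tracking, output guardrails
            fn(name, result)
        return result

hooks = AgentHooks()
log = []

@hooks.on_before_tool
def track(name, args):
    log.append(("before", name))

@hooks.on_after_tool
def audit(name, result):
    log.append(("after", name, result))

answer = hooks.call_tool("add", lambda a, b: a + b, 2, 3)
```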

Async runtime

AsyncAgentRuntime with async run() and async stream() for FastAPI, Starlette, and modern async Python. Same features as the sync runtime.

Graceful error recovery

Tool failures produce error messages instead of crashing the run. The LLM sees the error and can try a different approach. Safer retry defaults prevent retrying on bugs.
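The recovery pattern reduces to catching the tool's exception and returning it as data the model can read (a sketch of the technique, not the library's exact result shape):

```python
def safe_tool_call(tool, *args):
    """Convert a tool crash into an error message the LLM can react to."""
    try:
        return {"ok": True, "output": tool(*args)}
    except Exception as exc:  # the run continues; the model sees the failure
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}

result = safe_tool_call(lambda path: open(path).read(), "/no/such/file")
```

The error string goes back into the conversation as the tool result, so the next LLM turn can pick a different tool or fix its arguments.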


Next steps


Try it now — runnable examples

The repo ships with 7 numbered, copy-pasteable examples covering every major feature. Pick one and run it in 30 seconds.

#   What                                                          Run
1   Hello, agent. The shortest possible runnable example          python examples/01_hello_agent.py
2   Live streaming with colored reasoning events                  python examples/02_streaming_with_reasoning.py
3   Same agent, 5 different LLM providers back-to-back            python examples/03_provider_swap.py
4   End-to-end research workflow with web search + URL fetching   python examples/04_research_agent.py "your question"
5   Custom tools — function-style and class-style                 python examples/05_custom_tool.py
6   Persistent chat session with file-backed memory               python examples/06_chat_session.py
7   Semantic tool discovery with tool_search                      python examples/07_tool_search.py

See the full examples README →


Provider compatibility matrix

Provider                                 Reasoning blocks                          Bedrock pairing
OpenAI (o1, o3, o4, gpt-5)               ✅ Native                                  n/a
OpenAI (gpt-4o, gpt-4o-mini)                                                       n/a
Anthropic (claude-opus-4, claude-3.7)    ✅ Native (with thinking_budget_tokens)    n/a
AWS Bedrock (gpt-oss-120b)               ✅ Via LiteLLM                             ✅ Bulletproof
AWS Bedrock (anthropic.claude-*)         ✅ Via LiteLLM                             ✅ Bulletproof
Google Gemini (gemini-1.5-pro)                                                     n/a
Google Vertex AI                                                                   n/a
Groq (llama-3.3-70b)                                                               n/a
Together AI                                                                        n/a
Ollama (local)                                                                     n/a
DeepSeek R1 (via LiteLLM proxy)          ✅ Native                                  n/a
LiteLLM Proxy (self-hosted gateway)      ✅ Pass-through                            n/a

Tip: if you want a "Thinking" panel UI without paying for o1/Claude, AWS Bedrock's openai.gpt-oss-120b-1:0 is the cheapest reasoning-capable model in the matrix and ships with Agent.with_builtins(llm=BedrockChatLLM()) out of the box.


What you get vs. what you don't

✅ shipit-agent does                                 ❌ shipit-agent does NOT do
Run agents with tools, MCP, memory, sessions         Train models or fine-tune
Stream events incrementally as they happen           Provide a hosted control plane
Extract reasoning blocks from any provider           Replace LangChain / LangGraph / CrewAI wholesale
Guarantee Bedrock tool-pairing correctness           Manage your cloud infrastructure
Support 9 LLM providers via one API                  Lock you into a specific vendor
Ship with 28+ built-in tools                         Force you to use any of them
Stay out of your way (small, focused runtime)        Hide the agent loop behind abstractions

This is a library, not a framework. The runtime is small enough to read in one sitting (shipit_agent/runtime.py is under 400 lines). Bring your own LLM, tools, and storage; the runtime composes them and gets out of the way.