ComputerUseAgent · v1.0.8

Self-host Devin.
In thirty lines of Python.

Name: SHIPIT Agent
Author: SHIPIT

Drive a real browser by showing screenshots to a vision-capable LLM. Use it standalone for one-shot workflows, or plug it into your main Agent as a browser_use tool. Anthropic is supported natively, with a plain-text fallback for any other vision LLM.


Four steps. Endlessly composable.

The same loop drives a 4-iteration price lookup or a 30-iteration form-filling workflow. Recovery from failed actions is built in.

Step 01

Screenshot

Take a base64 PNG of the current viewport.

Step 02

Reason

Vision LLM looks at the screenshot + the goal, picks the next action.

Step 03

Act

Click, type, scroll, navigate, or signal `done`.

Step 04

Loop

Until the model emits `done` or the loop reaches `max_iterations`.
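The four steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `FakeBrowser`, `scripted_model`, `run_loop`, and the action dictionaries are all hypothetical stand-ins, and the recovery strategy (feeding the last error back to the model) is an assumption about one reasonable way to handle failed actions.

```python
import base64
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser session; records actions only."""
    log: list = field(default_factory=list)

    def screenshot_b64(self) -> str:
        # A real session would return a base64-encoded PNG of the viewport.
        return base64.b64encode(b"fake-png-bytes").decode("ascii")

    def perform(self, action: dict) -> None:
        # Simulate a failed action so the recovery path is exercised.
        if action["type"] == "click" and action.get("target") == "missing":
            raise RuntimeError("element not found")
        self.log.append(action)


def scripted_model(script):
    """Hypothetical stand-in for a vision LLM: replays a fixed action list."""
    steps = iter(script)

    def choose(screenshot_b64: str, goal: str, last_error: Optional[str]) -> dict:
        return next(steps)

    return choose


def run_loop(browser, choose_action, goal, max_iterations=10):
    last_error = None
    for iteration in range(1, max_iterations + 1):
        shot = browser.screenshot_b64()                 # Step 01: Screenshot
        action = choose_action(shot, goal, last_error)  # Step 02: Reason
        if action["type"] == "done":                    # model signals completion
            return {"status": "done", "iterations": iteration,
                    "text": action.get("text", "")}
        try:
            browser.perform(action)                     # Step 03: Act
            last_error = None
        except RuntimeError as exc:
            last_error = str(exc)                       # passed to the model next turn
    # Step 04: the loop ended because the iteration cap was reached.
    return {"status": "max_iterations", "iterations": max_iterations, "text": ""}


script = [
    {"type": "navigate", "url": "https://example.com"},
    {"type": "click", "target": "missing"},  # fails; the loop keeps going
    {"type": "click", "target": "price"},
    {"type": "done", "text": "Price found."},
]
browser = FakeBrowser()
result = run_loop(browser, scripted_model(script), goal="Find the price.")
print(result["status"], result["iterations"])
```

The iteration cap is what keeps a confused model from looping forever, so it is worth setting per task rather than leaving it at a single global value.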

Two patterns

Standalone or as a tool inside your main Agent.

For one-shot browser work, run ComputerUseAgent directly. For production agents that mix browser work with web search, PDFs, RAG, or SQL, plug it in as BrowserAgentTool.

Pattern 1

Standalone ComputerUseAgent

single task

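from shipit_agent.computer_use import (
    ComputerUseAgent, PlaywrightBrowserSession,
)

# opus_llm is a vision-capable LLM client that you configure beforehand.

# The with block scopes the browser session to this one task.
with PlaywrightBrowserSession.launch(headless=True) as browser:
    agent = ComputerUseAgent(
        llm=opus_llm,
        browser=browser,
        goal="Find iPhone 15 Pro starting price.",
        max_iterations=10,  # hard cap on screenshot/act cycles
    )
    result = agent.run()
    print(result.final_text)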

Fastest path for one-shot workflows.

Pattern 2 · Recommended for production

BrowserAgentTool inside main Agent

composable

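from shipit_agent import Agent, VerifierNetwork
from shipit_agent.computer_use import (
    BrowserAgentTool, PlaywrightBrowserSession,
)
# WebSearchTool and PDFTool must also be imported, and opus_llm / haiku_llm
# configured, before this runs; those lines are omitted here.
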
# 1. Browser tool — owns its own LLM + browser factory
browser_tool = BrowserAgentTool(
    llm=opus_llm,
    browser_factory=lambda: PlaywrightBrowserSession.launch(headless=True),
    max_iterations=12,
)
 
# 2. Optional: verifier so destructive actions get gated
verifier = VerifierNetwork(llm=haiku_llm, goal="Research only — no purchases.")
 
# 3. Plug into your main planning Agent — browser_use is one tool among many
agent = Agent(
    llm=opus_llm,
    tools=[browser_tool, WebSearchTool(), PDFTool()],
    verifier=verifier,
)
 
# 4. Run a high-level goal — the main agent decides when to call browser_use
result = agent.run(
    "Find the cheapest direct SFO-JFK flight on May 20 "
    "and summarise the booking page."
)

The main agent picks `browser_use` when it's the right tool, the same way it picks `web_search` or `pdf_extract`.

Real recipes

Production patterns where browser-driving agents actually pay off. Drop the goal into your main Agent, let `browser_use` handle the click-paths.

Recipe 01

Price comparison

Goal sent to the agent

"Find the lowest price for [item] across [3 sites] and report the cheapest with a link."

Headless browser visits each site, extracts price, returns a structured comparison. Pair with `output_schema=PriceCompare` for typed output.
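One possible shape for `PriceCompare` is sketched below. Everything except the name `PriceCompare` is an assumption: the field names, the `PriceQuote` helper, and the use of standard-library dataclasses are illustrative, and the schema types that `output_schema` actually accepts should be checked in the docs. The sites and prices are placeholder values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PriceQuote:
    """Hypothetical record for one site's listing."""
    site: str
    price: float
    currency: str
    url: str


@dataclass
class PriceCompare:
    """Hypothetical typed result for the price-comparison goal."""
    item: str
    quotes: List[PriceQuote]

    def cheapest(self) -> PriceQuote:
        # Naive comparison: assumes every quote uses the same currency.
        return min(self.quotes, key=lambda q: q.price)


# Placeholder data standing in for what the agent would extract.
comparison = PriceCompare(
    item="example item",
    quotes=[
        PriceQuote("site-a.example", 19.99, "USD", "https://site-a.example/item"),
        PriceQuote("site-b.example", 17.49, "USD", "https://site-b.example/item"),
        PriceQuote("site-c.example", 18.00, "USD", "https://site-c.example/item"),
    ],
)
best = comparison.cheapest()
print(best.site, best.price, best.url)
```

Keeping the link on each quote means the "report the cheapest with a link" part of the goal falls out of the schema instead of relying on free text.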

Recipe 02

Form filling at scale

Goal sent to the agent

"Fill the application form with [name=…, email=…, message=…]. Pause before submitting."

Run completes when the agent reaches the Submit button. The human reviews the screenshot in `result.action_history[-1]` then clicks for real.
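A reviewer needs that final screenshot as a file they can open. The sketch below assumes each `action_history` entry is a dictionary carrying the screenshot as base64 PNG under a `screenshot_b64` key; that key, the entry layout, and the `save_screenshot` helper are hypothetical, so adapt them to the real entry structure.

```python
import base64
import tempfile
from pathlib import Path

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the 8-byte signature every PNG file starts with


def save_screenshot(entry: dict, out_dir: Path) -> Path:
    """Decode the entry's base64 screenshot and write it to out_dir."""
    raw = base64.b64decode(entry["screenshot_b64"])  # hypothetical key name
    if not raw.startswith(PNG_MAGIC):
        raise ValueError("entry does not carry a PNG screenshot")
    path = out_dir / "pending_submit.png"
    path.write_bytes(raw)
    return path


# Placeholder standing in for result.action_history[-1]; not real image data.
last_entry = {
    "action": "pause_before_submit",
    "screenshot_b64": base64.b64encode(PNG_MAGIC + b"demo").decode("ascii"),
}
saved = save_screenshot(last_entry, Path(tempfile.mkdtemp()))
print(saved.name)
```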

Recipe 03

End-to-end UI testing

Goal sent to the agent

"Sign up with email='test+{ts}@example.com' and verify the dashboard loads with the welcome banner. Report PASS or FAIL."

Adapts when the UI shifts (no flaky CSS selectors). Failure modes are captured in `action_history` for replay.
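To wire this recipe into a test runner, the `{ts}` placeholder has to be filled in and the PASS or FAIL verdict pulled out of the agent's final text. The helpers below are a sketch under assumptions: `build_goal` and `parse_verdict` are hypothetical names, and treating a missing or ambiguous verdict as a failure is a design choice, not documented library behaviour.

```python
import re
import time

GOAL_TEMPLATE = (
    "Sign up with email='test+{ts}@example.com' and verify the dashboard "
    "loads with the welcome banner. Report PASS or FAIL."
)


def build_goal(ts=None) -> str:
    """Fill in a timestamp so every run signs up with a fresh address."""
    ts = int(time.time()) if ts is None else ts
    return GOAL_TEMPLATE.format(ts=ts)


def parse_verdict(final_text: str) -> str:
    """Return PASS only when the text contains PASS and never FAIL."""
    found = set(re.findall(r"\b(PASS|FAIL)\b", final_text))
    # A missing or contradictory verdict is counted as a failure.
    return "PASS" if found == {"PASS"} else "FAIL"


goal = build_goal(ts=1700000000)
print(goal)
print(parse_verdict("Dashboard loaded with the welcome banner. PASS"))
```

In a test suite, `parse_verdict(result.final_text) == "PASS"` would be the assertion, with `action_history` attached to the report when it fails.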

Recipe 04

Internal SaaS without an API

Goal sent to the agent

"Log in to the analytics dashboard, navigate to the weekly report, capture the top-line numbers."

Use `share_browser=True` so credentials persist across calls. Combine with the verifier to gate any destructive actions.

Drive any browser.
Self-hosted. Yours.

Devin and OpenAI Operator are SaaS products. Ours is a library — fork it, run it on your own LLM, ship it in your own product.


pip install 'shipit-agent[anthropic,playwright]'
