browser-harness
Always use browser-harness for any web interaction: automation, scraping, testing, or site/app work.
Summary
This skill provides direct browser control via Chrome DevTools Protocol (CDP), enabling automation, scraping, testing, and web page interaction.
It connects to an already-running Chrome instance and uses a Python harness for reliable multi-line commands, making it ideal for developers who need to programmatically control the browser without manual intervention.
Overview
Direct browser control via CDP. For task-specific edits, use agent-workspace/agent_helpers.py. For setup, install, or connection problems, read https://github.com/browser-use/browser-harness/blob/main/install.md.
Domain skills are off by default. Set BH_DOMAIN_SKILLS=1 to enable them; see the bottom section.
If `BH_DOMAIN_SKILLS=1` and the task is site-specific, read every file in the matching `$BH_AGENT_WORKSPACE/domain-skills/<site>/` directory before inventing an approach.
Usage
browser-harness <<'PY'
print(page_info())
PY
- •Invoke as `browser-harness`. Use heredocs for multi-line commands.
- •Helpers are pre-imported. `run.py` calls `ensure_daemon()` before `exec`.
- •First navigation is `new_tab(url)`, not `goto_url(url)`.
- •The normal local flow attaches to the running Chrome/Chromium CDP endpoint. No browser ids or local profile selection.
Local Chrome
If the daemon cannot connect, run diagnostics:
browser-harness --doctor
If Chrome remote debugging is not enabled, the harness opens:
chrome://inspect/#remote-debugging
Ask the user to tick "Allow remote debugging for this browser instance" and click Allow if Chrome shows a permission popup. Then retry the same browser-harness command.
Remote Browsers
Use Browser Use cloud for headless servers, parallel sub-agents, or isolated work. Authenticate once:
browser-harness auth login
Or import a key safely:
browser-harness auth login --api-key-stdin
Pick a short made-up name; r7k2 below is just a placeholder:
browser-harness <<'PY'
start_remote_daemon("r7k2")
PY
BU_NAME=r7k2 browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY
When the task is done and a cloud browser is still running, ask directly: "Should I close this browser now?" If yes, run `stop_remote_daemon(name)`. Remote daemons bill until they stop or time out.
Do not start a remote daemon and then keep using the default daemon. Use the same name for BU_NAME.
Cloud profile cookie sync reference: https://github.com/browser-use/browser-harness/blob/main/interaction-skills/profile-sync.md.
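When another Python process drives the harness, the heredoc is just standard input and the daemon name travels in the BU_NAME environment variable. The sketch below shows one way to keep the two in step. `harness_invocation` and `run_in_named_browser` are hypothetical wrappers, not part of browser-harness; only the `browser-harness` command and `BU_NAME` come from the text above.

```python
import os
import subprocess


def harness_invocation(script, name=None, base_env=None):
    """Return (argv, env, stdin_text) for one browser-harness call.

    Hypothetical wrapper that mirrors the shell form
    `BU_NAME=<name> browser-harness <<'PY' ... PY`.
    """
    env = dict(os.environ if base_env is None else base_env)
    if name is not None:
        # Must be the same name that was passed to start_remote_daemon(name).
        env["BU_NAME"] = name
    # With no name the environment is left untouched (default daemon).
    return ["browser-harness"], env, script


def run_in_named_browser(script, name=None):
    """Pipe `script` to browser-harness on stdin, as a heredoc would."""
    argv, env, stdin_text = harness_invocation(script, name)
    return subprocess.run(
        argv, input=stdin_text, env=env, text=True, capture_output=True
    )
```

Calling `run_in_named_browser('print(page_info())\n', name="r7k2")` would then target the same remote daemon on every call, assuming `browser-harness` is on the PATH and the daemon was started under that name.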
Page Workflow
- •Screenshots first: use `capture_screenshot()` to understand visible state.
- •Clicking: screenshot -> read pixel -> `click_at_xy(x, y)` -> screenshot again.
- •After navigation, call `wait_for_load()`.
- •If the current tab is stale or internal, call `ensure_real_tab()`.
- •Use `js(...)` for DOM inspection or extraction when coordinates are the wrong tool.
- •Login walls: stop and ask. Exception: use available SSO automatically when Chrome is already signed in; still stop for passwords, MFA, consent, or ambiguous account choice.
- •Raw CDP is available with `cdp("Domain.method", ...)`.
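The click loop in this list can be sketched as a small function. The helpers are passed in as arguments so the sketch runs outside the harness; inside a heredoc they are pre-imported and could be handed over directly. `click_and_check` is a hypothetical name. Detecting a change by comparing two `capture_screenshot()` results with `!=` is an assumption about the return type, not documented harness behaviour.

```python
def click_and_check(capture_screenshot, click_at_xy, wait_for_load, x, y):
    """Screenshot, click at (x, y), wait, then screenshot again.

    Returns (before, after, changed). The three callables are expected to
    behave like the harness helpers of the same name. Comparing the two
    screenshots with != is an assumption about what they return.
    """
    before = capture_screenshot()  # understand the visible state first
    click_at_xy(x, y)              # coordinate click at the chosen pixel
    wait_for_load()                # the click may have triggered navigation
    after = capture_screenshot()   # confirm what actually happened
    return before, after, before != after
```

Inside a harness heredoc this might be called as `click_and_check(capture_screenshot, click_at_xy, wait_for_load, 120, 340)`, where the coordinates are placeholders read from the first screenshot.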
Interaction Skills
If you get stuck on a browser mechanic, check https://github.com/browser-use/browser-harness/tree/main/interaction-skills.
- •connection.md
- •cookies.md
- •cross-origin-iframes.md
- •dialogs.md
- •downloads.md
- •drag-and-drop.md
- •dropdowns.md
- •iframes.md
- •network-requests.md
- •print-as-pdf.md
- •profile-sync.md
- •screenshots.md
- •scrolling.md
- •shadow-dom.md
- •tabs.md
- •uploads.md
- •viewport.md
Design Constraints
- •Coordinate clicks default. CDP mouse events pass through iframes/shadow/cross-origin at the compositor level.
- •Keep the connection model simple: use the default daemon, `BU_NAME`, `BU_CDP_URL`, `BU_CDP_WS`, or `start_remote_daemon(...)`.
- •Core helpers stay short. Put task-specific helper additions in `$BH_AGENT_WORKSPACE/agent_helpers.py`.
Gotchas
- •`chrome://inspect/#remote-debugging` must be enabled for local Chrome control.
- •Chrome may show an "Allow remote debugging?" popup; wait for the user to click Allow.
- •Omnibox popups are not real work tabs.
- •CDP target order is not Chrome's visible tab-strip order.
- •`BU_CDP_URL` is an HTTP DevTools endpoint; the daemon resolves it to WebSocket.
- •Ask before leaving cloud browsers running; stop them with `stop_remote_daemon(name)` or `PATCH /browsers/{id} {"action":"stop"}`.
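For context on the HTTP-to-WebSocket step: Chrome's remote-debugging port serves a `/json/version` document whose `webSocketDebuggerUrl` field holds the browser-level WebSocket address. The sketch below shows that standard lookup. Whether the daemon resolves `BU_CDP_URL` in exactly this way is an assumption, and `resolve_ws_url` and `fetch_json` are illustrative names, not harness functions.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_json(url, timeout=5.0):
    """GET `url` and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def resolve_ws_url(http_endpoint, fetch=fetch_json):
    """Map an HTTP DevTools endpoint to its browser-level WebSocket URL.

    Illustrative only: this is Chrome's standard /json/version lookup,
    not necessarily what the browser-harness daemon does internally.
    """
    version_url = urljoin(http_endpoint.rstrip("/") + "/", "json/version")
    info = fetch(version_url)
    ws_url = info.get("webSocketDebuggerUrl")
    if not ws_url:
        raise RuntimeError(f"no webSocketDebuggerUrl in {version_url}")
    return ws_url
```

The `fetch` parameter exists so the lookup can be exercised without a running browser; with Chrome listening locally, `resolve_ws_url("http://127.0.0.1:9222")` would query the real endpoint (the port here is only the conventional example).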
Domain Skills
Only applies when BH_DOMAIN_SKILLS=1. Otherwise ignore domain skills.
When enabled, search $BH_AGENT_WORKSPACE/domain-skills/<host>/ before inventing an approach. goto_url(...) returns up to 10 skill filenames for the navigated host.
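The lookup described above can be approximated in a few lines. This is a sketch under stated assumptions, not the harness's own implementation: `domain_skill_files` is a hypothetical name, the directory is assumed to be named after the bare host, and sorting the filenames is my choice. The `BH_DOMAIN_SKILLS` gate, the `$BH_AGENT_WORKSPACE/domain-skills/<host>/` path, and the cap of 10 come from the text.

```python
import os
from pathlib import Path


def domain_skill_files(host, env=None, limit=10):
    """List skill filenames for `host`, or [] when domain skills are off."""
    env = os.environ if env is None else env
    if env.get("BH_DOMAIN_SKILLS") != "1":
        return []  # domain skills are off by default
    workspace = env.get("BH_AGENT_WORKSPACE")
    if not workspace:
        return []
    skill_dir = Path(workspace) / "domain-skills" / host
    if not skill_dir.is_dir():
        return []
    # Sorted order is an assumption made for deterministic output.
    names = sorted(p.name for p in skill_dir.iterdir() if p.is_file())
    return names[:limit]
```

An empty result covers three cases alike: the feature is disabled, the workspace variable is unset, or no directory exists for the host. In each case there is nothing to read before choosing an approach.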
Install & Usage
mkdir -p .claude/skills
Add the configuration to .claude/skills/browser-harness.md
/browser-harness
Use Cases
Usage Examples
/browser Open a new tab to https://example.com and print the page title
Use browser-harness to scrape all product prices from a shopping site and save to CSV
Automate login to a web app, fill a form, and verify the success message appears
Frequently Asked Questions
How do I install browser-harness?
To install browser-harness, create the skills directory (mkdir -p .claude/skills), then add the config to .claude/skills/browser-harness.md. Finally, run /browser-harness in Claude Code.
What is browser-harness best for?
browser-harness is listed in the General category under the type "other". It is tagged for testing and API work, and it was created by browser-use.
What can I use browser-harness for?
browser-harness is useful for:
- •Automating form submissions and data extraction on dynamic web pages.
- •Running end-to-end tests by navigating, clicking, and asserting page content.
- •Scraping content from multiple pages with pagination or infinite scroll.
- •Interacting with web apps that require login sessions or complex user flows.
- •Taking screenshots or capturing network activity for debugging or monitoring.
- •Orchestrating parallel browser sessions for load testing or multi-account workflows.