agent-browser Is a Useful CLI Tool Even Without an LLM

agent-browser is a headless browser CLI built by Vercel and designed for AI agents. Give an LLM access to it and it can navigate pages, fill forms, take screenshots, and extract content without any custom integration.

But the CLI is handy on its own, too. Think of it like curl, but for the browser.

How people use it with LLMs

The idea is that you give an agent access to the CLI and just tell it what to do:

Fill out the contact form at example.com and submit it

The agent figures out the rest. It opens the page, reads the accessibility tree to find form fields, fills them in, and clicks submit. No selectors or custom integration code needed. The agent drives the browser the same way a human would, just faster.
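
As a rough sketch, the steps the agent takes might look like the shell function below. The snapshot, fill, and click commands and the @e-style element refs are assumptions based on the project's README rather than anything shown in this post. The function name, ref values, and form data are hypothetical; a real agent reads the actual refs from the snapshot output before acting.

```shell
# Sketch of the agent's steps for the contact form.
# Assumption: snapshot, fill and click behave as described in the project
# README; confirm with `agent-browser --help` before relying on them.
# The refs (@e1, @e2, @e3) are hypothetical; real ones come from the snapshot.
submit_contact_form() {
  agent-browser open "$1" || return 1                  # load the page
  agent-browser snapshot || return 1                   # print the accessibility tree with refs
  agent-browser fill @e1 "Ada Lovelace" || return 1    # name field (hypothetical ref)
  agent-browser fill @e2 "ada@example.com" || return 1 # email field (hypothetical ref)
  agent-browser click @e3                              # submit button (hypothetical ref)
}
```

Calling submit_contact_form example.com runs the steps in order and stops at the first one that fails.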

Using it without an LLM

Screenshot any DOM node by CSS selector:

agent-browser open example.com && agent-browser screenshot "#example" ./example.png

Compare two URLs visually:

agent-browser diff url https://staging.example.com https://example.com --screenshot
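
To compare more than one page, the same command can run in a loop. This is a minimal sketch that uses only the diff url command and the --screenshot flag shown above. The diff_paths function and the list of paths are hypothetical, and how each result is reported is left to the CLI.

```shell
# Hypothetical helper: diff the same paths on two hosts.
# Usage: diff_paths <base-a> <base-b> <path>...
diff_paths() {
  local base_a="${1%/}"   # drop a trailing slash so paths join cleanly
  local base_b="${2%/}"
  local path
  shift 2
  for path in "$@"; do
    echo "Comparing $path"
    agent-browser diff url "$base_a$path" "$base_b$path" --screenshot
  done
}
```

For example, diff_paths https://staging.example.com https://example.com / /pricing /docs would compare three pages in turn.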

Generate a PDF of any page:

agent-browser open example.com && agent-browser pdf ./example.pdf
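
The same pattern extends to a batch of pages. The sketch below uses only the open and pdf commands from the example above. The slug_for_url and pdf_each helpers and the file naming scheme are hypothetical, chosen just to give each page its own output file.

```shell
# Hypothetical helper: turn a URL into a file-system-safe name.
slug_for_url() {
  local url="${1#*://}"   # strip the scheme, if any
  url="${url%/}"          # strip one trailing slash
  printf '%s' "$url" | tr -c 'A-Za-z0-9._-' '_'
}

# Hypothetical helper: save each URL as <out_dir>/<slug>.pdf.
# Usage: pdf_each <out_dir> <url>...
pdf_each() {
  local out_dir="$1" url
  shift
  mkdir -p "$out_dir"
  for url in "$@"; do
    # Only write the PDF if the page opened successfully.
    agent-browser open "$url" && agent-browser pdf "$out_dir/$(slug_for_url "$url").pdf"
  done
}
```

For example, pdf_each ./pdfs example.com example.com/about would open each page and write one PDF per URL into ./pdfs.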

Install it globally for the best experience:

npm install -g agent-browser
agent-browser install