agent-browser Is a Useful CLI Tool Even Without an LLM
agent-browser is a headless browser CLI built by Vercel and designed for AI agents. Give an LLM access to it, and the LLM can navigate pages, fill forms, take screenshots, and extract content without any custom integration.
But the CLI is handy on its own, too. Think of it like curl, but for the browser.
How people use it with LLMs
You give an agent access to the CLI and tell it what to do in plain language:
Fill out the contact form at example.com and submit it
The agent figures out the rest. It opens the page, reads the accessibility tree to find form fields, fills them in, and clicks submit. No selectors or custom integration code needed. The agent drives the browser the same way a human would, just faster.
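Run by hand, the sequence the agent goes through looks roughly like the sketch below. It assumes the snapshot, fill, and click subcommands and the @e-style element refs as described in the project's README. The refs (@e1, @e2, @e3), the form values, the function name fill_contact_form, and the AGENT_BROWSER variable are all made up for illustration. Real refs come from the snapshot output of the page you opened, and AGENT_BROWSER is only read by this script, not by the tool.

```shell
# Sketch of the steps an agent would run against a contact form.
# AGENT_BROWSER is a hypothetical override used only by this script.
fill_contact_form() {
  ab="${AGENT_BROWSER:-agent-browser}"
  url="$1"

  "$ab" open "$url" || return 1
  # Print the accessibility tree so the form fields and their refs are visible.
  "$ab" snapshot || return 1
  # Placeholder refs: replace them with the ones the snapshot reports.
  "$ab" fill @e1 "Jane Doe" || return 1
  "$ab" fill @e2 "jane@example.com" || return 1
  "$ab" click @e3
}
```

Calling fill_contact_form https://example.com runs the five steps in order and stops at the first one that fails. Setting AGENT_BROWSER=echo beforehand prints the commands instead of running them, which is a cheap way to check the sequence.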
Using it without an LLM
Screenshot any DOM node:
agent-browser open example.com && agent-browser screenshot "#example" ./example.png
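Because open and screenshot are separate commands, one open can plausibly feed several screenshots. The sketch below assumes the opened page stays loaded for later commands, as the one-liner above implies. The function name shoot_nodes, the node-N.png naming, and the AGENT_BROWSER variable are hypothetical and exist only in this script.

```shell
# Open a page once, then screenshot each selector into its own file.
# AGENT_BROWSER is a hypothetical override used only by this script.
shoot_nodes() {
  ab="${AGENT_BROWSER:-agent-browser}"
  url="$1"
  outdir="$2"
  shift 2

  mkdir -p "$outdir" || return 1
  "$ab" open "$url" || return 1

  i=1
  for selector in "$@"; do
    # Files are numbered in the order the selectors were given.
    "$ab" screenshot "$selector" "$outdir/node-$i.png" || return 1
    i=$((i + 1))
  done
}
```

For example, shoot_nodes example.com ./shots "#example" "#footer" would write ./shots/node-1.png and ./shots/node-2.png, where "#footer" is a made-up selector.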
Compare two URLs visually:
agent-browser diff url https://staging.example.com https://example.com --screenshot
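To check more than one page, the same command can be wrapped in a loop. This is a minimal sketch that uses only the diff url invocation shown above. The function name compare_paths, the example paths, and the AGENT_BROWSER variable are hypothetical, and the script makes no assumption about the command's exit code or where it writes its output.

```shell
# Compare the same paths on two hosts, one diff per path.
# AGENT_BROWSER is a hypothetical override used only by this script.
compare_paths() {
  ab="${AGENT_BROWSER:-agent-browser}"
  base_a="$1"
  base_b="$2"
  shift 2

  for path in "$@"; do
    # Label each comparison so the output is easy to scan.
    echo "== $path"
    "$ab" diff url "$base_a$path" "$base_b$path" --screenshot
  done
}
```

A call such as compare_paths https://staging.example.com https://example.com / /pricing /contact would run three comparisons, one per path.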
Generate a PDF of any page:
agent-browser open example.com && agent-browser pdf ./example.pdf
To try these, install it globally, then run the install subcommand once to download the browser it drives:
npm install -g agent-browser
agent-browser install