Agent Web Interface

An MCP server for browser automation that exposes semantic, token-efficient page representations optimized for LLM agents.

Motivation

LLM-based agents operate under strict context window and token constraints. However, most browser automation tools expose entire DOMs or full accessibility trees to the model.

This leads to:

Rapid token exhaustion
Higher inference costs
Reduced reliability as relevant signal is buried in noise

In practice, agents spend more effort finding the right information than reasoning about it.

Agent Web Interface exists to change the unit of information exposed to the model.

Core Idea: Semantic Page Snapshots

Instead of exposing raw DOM structures or full accessibility trees, Agent Web Interface produces semantic page snapshots.

These snapshots are:

Compact and structured
Focused on user-visible intent
Designed for LLM recall and reasoning, not DOM completeness
Stable across layout shifts and DOM churn

The goal is not to mirror the browser, but to present the page in a form that aligns with how language models reason about interfaces.

How It Works

At a high level:

The browser is controlled via Puppeteer and CDP
The page is reduced into semantic regions and actionable elements
A structured snapshot is generated and sent to the LLM
Actions are resolved against stable semantic identifiers rather than fragile selectors

This separation keeps:

Browser lifecycle management isolated
Snapshots deterministic and low-entropy
Agent reasoning predictable and efficient

Benchmarks

Early benchmarks against Playwright MCP show:

~19% fewer tokens consumed
~33% faster task completion
Same or better success rates on common navigation tasks

Benchmarks were run using Claude Code on representative real-world tasks. Results are task-dependent and should be treated as directional rather than absolute.

What Agent Web Interface Is (and Is Not)

Agent Web Interface is:

A semantic interface between browsers and LLM agents
An MCP server focused on reliability and efficiency
Designed for agent workflows, not test automation

Agent Web Interface is not:

A general-purpose browser
A visual testing or screenshot framework
A replacement for Puppeteer

Puppeteer remains the execution layer; Agent Web Interface focuses on representation and reasoning.

Usage

Agent Web Interface implements the Model Context Protocol (MCP) and works with:

Claude Code
Claude Desktop
Cursor
VS Code
Any MCP-compatible client

Example workflows include:

Navigating complex web apps
Handling login and consent flows
Performing multi-step UI interactions with lower token usage

See the examples/ directory for concrete agent workflows.

Claude Code

# Basic (auto-launches browser)
claude mcp add agent-web-interface -- npx agent-web-interface@latest

# With auto-connect to your Chrome profile
claude mcp add agent-web-interface -- npx agent-web-interface@latest --autoConnect

# Headless mode
claude mcp add agent-web-interface -- npx agent-web-interface@latest --headless

CLI Arguments

The server accepts the following arguments to configure browser initialization:

Argument	Description	Default
`--headless=true\|false`	Run browser in headless mode	`false`
`--browserUrl`	HTTP endpoint to connect to existing browser	-
`--wsEndpoint`	WebSocket endpoint to connect to existing browser	-
`--autoConnect`	Auto-connect to Chrome 144+ via DevToolsActivePort	`false`
`--isolated`	Use isolated temp profile instead of persistent	`false`
`--userDataDir`	Chrome user data directory	Platform default
`--channel`	Chrome channel (chrome, chrome-canary, chrome-beta, chrome-dev)	`chrome`
`--executablePath`	Path to Chrome executable	-

Automatic Browser Initialization

The browser is automatically launched or connected on the first tool call. No explicit initialization is needed.

Examples:

# Auto-launch visible browser (default)
npx agent-web-interface

# Launch headless browser
npx agent-web-interface --headless

# Auto-connect to Chrome with remote debugging enabled
npx agent-web-interface --autoConnect

# Connect to specific endpoint
npx agent-web-interface --browserUrl http://localhost:9222

Using Your Existing Chrome Profile (Chrome 144+)

To connect with your bookmarks, extensions, and logged-in sessions:

Navigate to chrome://inspect/#remote-debugging in Chrome
Enable remote debugging and allow the connection
Use --autoConnect CLI argument

{
  "mcpServers": {
    "agent-web-interface": {
      "command": "npx",
      "args": ["agent-web-interface@latest", "--autoConnect"]
    }
  }
}

Environment Variables

Variable	Description	Default
`AWI_TRIM_REGIONS`	Set to `false` to disable region trimming globally	`true`
`CEF_BRIDGE_HOST`	CDP host for browser connection	`127.0.0.1`
`CEF_BRIDGE_PORT`	CDP port for browser connection	`9223`

Installation

git clone https://github.com/lespaceman/agent-web-interface
cd agent-web-interface
npm install
npm run build

Configure the MCP server in your client according to its MCP integration instructions.

Architecture Overview

Agent Web Interface separates concerns into three layers:

Browser lifecycle — page creation, navigation, teardown
Semantic snapshot generation — regions, elements, identifiers
Action resolution — mapping agent intent to browser actions

This separation allows each layer to evolve independently while keeping agent-visible behavior stable.

Status

Agent Web Interface is under active development. APIs and snapshot formats may evolve as real-world agent usage informs the design.

Feedback from practitioners building agent systems is especially welcome.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 222 Commits
.claude-plugin		.claude-plugin
.github		.github
commands		commands
docs		docs
scripts		scripts
src		src
tests		tests
.gitignore		.gitignore
.npmignore		.npmignore
.prettierignore		.prettierignore
.prettierrc.json		.prettierrc.json
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
eslint.config.js		eslint.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.build.json		tsconfig.build.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agent Web Interface

Motivation

Core Idea: Semantic Page Snapshots

How It Works

Benchmarks

What Agent Web Interface Is (and Is Not)

Agent Web Interface is:

Agent Web Interface is not:

Usage

Claude Code

CLI Arguments

Automatic Browser Initialization

Using Your Existing Chrome Profile (Chrome 144+)

Environment Variables

Installation

Architecture Overview

Status

License

About

Uh oh!

Releases 15

Packages

Uh oh!

Contributors

Uh oh!

Languages

License

lespaceman/agent-web-interface

Folders and files

Latest commit

History

Repository files navigation

Agent Web Interface

Motivation

Core Idea: Semantic Page Snapshots

How It Works

Benchmarks

What Agent Web Interface Is (and Is Not)

Agent Web Interface is:

Agent Web Interface is not:

Usage

Claude Code

CLI Arguments

Automatic Browser Initialization

Using Your Existing Chrome Profile (Chrome 144+)

Environment Variables

Installation

Architecture Overview

Status

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 15

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages