Without the vision model.
WebScope renders any web page into a structured text grid that LLMs can read, understand, and interact with — no screenshots, no vision APIs, no pixel parsing.
or try it below ↓
Why WebScope?
Same page, radically different cost.
[0] Hacker News [1] new | [2] past | [3] comments [4] login 1. [5] Show HN: WebScope – text-grid browser for AI agents (github.com) 142 pts by adityapandey 3h ago | [6] 89 comments 2. [7] Why LLMs don't need screenshots 87 pts by somebody 5h ago | [8] 34 comments [9:______________________] [10 Search]
Features
Everything your agent needs to browse the web, in one package.
See where elements are on the page, not just what they are. Proximity, grouping, and visual hierarchy — all intact.
Zero API calls to GPT-4V or Claude Vision. The output is plain text — native to how LLMs already think.
Real Chromium renders SPAs, auth flows, and dynamic content — exactly like a browser would.
Persistent cookies, localStorage, and auth state across page transitions. Resume flows anytime.
Isolated sessions run in parallel. Multiple agents browse simultaneously without collisions.
16 tools, plug into Claude Desktop, Cursor, Windsurf, or Cline with zero configuration.
Live Demo
One command. Real output. ~500 bytes.
Rendering: https://example.com ════════════════════════════════════════ Example Domain ════════════════════════════════════════ This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission. [0]More information... ──────────────────────────────────────── Elements: 1 interactive | Rendered in 84ms
Integrations
Drop WebScope into whatever you're already using.
// Claude Desktop — claude_desktop_config.json { "mcpServers": { "webscope": { "command": "webscope-mcp" } } }
import json with open("tools/tool_definitions.json") as f: webscope_tools = json.load(f)["tools"] response = openai.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": "Go to example.com"}], tools=webscope_tools, )
from tools.langchain import get_webscope_tools # Start the server: webscope --serve 3000 tools = get_webscope_tools(base_url="http://localhost:3000") agent = initialize_agent(tools, llm, agent="zero-shot-react-description") agent.run("Find the top story on Hacker News")
from tools.crewai import ( WebScopeBrowseTool, WebScopeClickTool, WebScopeTypeTool ) researcher = Agent( role="Web Researcher", tools=[WebScopeBrowseTool(), WebScopeClickTool()], llm=llm, )
# Start the server $ webscope --serve 3000 # Navigate $ curl -X POST http://localhost:3000/navigate \ -H 'Content-Type: application/json' \ -d '{"url": "https://example.com"}' # Interact $ curl -X POST http://localhost:3000/click -d '{"ref": 3}' $ curl -X POST http://localhost:3000/type -d '{"ref": 7, "text": "hello"}'
const { AgentBrowser } = require('webscope'); const browser = new AgentBrowser({ cols: 120 }); const { view, elements, meta } = await browser.navigate( 'https://example.com' ); console.log(view); // The text grid console.log(elements); // { 0: { selector, tag, text, href }, ... } await browser.click(3); await browser.type(7, 'hello'); await browser.close();
Under the Hood
A real Chromium instance loads the page with full JavaScript and CSS execution.
Every visible element's position, size, text, and interactivity is captured via getBoundingClientRect().
Pixel coordinates are converted to character grid positions, preserving spatial layout.
Interactive elements get [ref] numbers so agents can click, type, and scroll by reference.
Use Cases
Automate multi-step ATS flows across Greenhouse, Lever, and custom portals.
Extract structured data from any page. Search, navigate, and aggregate results programmatically.
Verify layouts and content without pixel matching. Assert field values across page transitions.
Fill, validate, and submit complex forms across multi-step wizards.
Track changes on live pages. Detect updates, compare snapshots, alert on diffs.
Data Privacy: WebScope runs entirely on your machine. No data is collected, transmitted, or stored by us. Your browsing sessions are yours alone.