How AI agents read websites

Agents do not just need access to the web. They need a usable reading surface. The most reliable systems separate navigation state from content state and let models reason over a cleaned representation.

Primary use: Why successful agents separate browser interaction from clean content extraction.
Recommended flow: Fetch, clean, measure tokens, then hand consistent Markdown to agents or retrieval systems.
Next step: Use the Playground to compare raw HTML against optimized output before integrating the API.

Navigation and reading are different tasks

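Browsers are good at clicking, waiting and executing scripts. LLMs are better at reasoning over compact, structured text. Asking a model to handle both jobs over the raw DOM is expensive, because markup, scripts and styling consume tokens without adding meaning, and fragile, because small layout changes alter what the model sees.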
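To make the cost difference concrete, here is a minimal sketch that strips non-reading markup from a small page and compares rough token estimates before and after. It uses only Python's standard `html.parser`. The sample HTML, the tag lists, and the four-characters-per-token heuristic are assumptions made for illustration, not part of any particular product or tokenizer.

```python
from html.parser import HTMLParser

# Assumed lists for this sketch: tags whose contents are not reading material,
# and tags that should end a line of text.
SKIPPED_TAGS = {"script", "style", "nav", "footer", "noscript"}
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li", "section", "article"}


class TextExtractor(HTMLParser):
    """Collects visible text while ignoring the contents of SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")  # block boundary becomes a line break

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def clean_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Use the target model's real tokenizer for anything that matters.
    return max(1, len(text) // 4)


RAW_HTML = """
<html><head><style>.card{margin:0}</style>
<script>window.analytics = {track: function () {}};</script></head>
<body><nav><a href="/">Home</a><a href="/pricing">Pricing</a></nav>
<div class="card"><h1>Release notes</h1>
<p>Version 2 adds <b>batch export</b>.</p></div>
<footer>Copyright notice</footer></body></html>
"""

cleaned = clean_text(RAW_HTML)
print(cleaned)
print(estimate_tokens(RAW_HTML), "estimated tokens raw,",
      estimate_tokens(cleaned), "after cleaning")
```

On real pages the gap is usually wider than in this toy input, since production markup carries far more script, style and navigation content than reading text.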

A practical pattern

Use automation to reach the right page state, then extract normalized content for analysis, summarization or retrieval. That creates a smaller, more deterministic prompt surface.
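The pattern above can be sketched as two stages with a narrow handoff between them. In this sketch `PageState`, `MarkdownNormalizer`, `read_page` and the two `render_*` functions are hypothetical names, and the `example.com` URL is a placeholder. The `render_*` functions stand in for real browser automation, which would click, wait and then return the rendered HTML. The normalizer handles only headings, paragraphs and list items, so treat it as an illustration of the idea and not a complete extractor.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Callable


@dataclass(frozen=True)
class PageState:
    """What the navigation stage hands over: the URL and the rendered HTML."""
    url: str
    html: str


class MarkdownNormalizer(HTMLParser):
    """Turns headings, paragraphs and list items into minimal Markdown lines."""

    PREFIXES = {"h1": "# ", "h2": "## ", "h3": "### ", "li": "- ", "p": ""}

    def __init__(self):
        super().__init__()
        self.lines = []
        self.current_tag = None
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.PREFIXES:
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current_tag:
            # Collapse whitespace so source formatting does not leak through.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.lines.append(self.PREFIXES[tag] + text)
            self.current_tag = None


def normalize(state: PageState) -> str:
    parser = MarkdownNormalizer()
    parser.feed(state.html)
    parser.close()
    return "\n\n".join(parser.lines)


def read_page(navigate: Callable[[], PageState]) -> str:
    """Stage 1 reaches the page state; stage 2 extracts the reading surface."""
    return normalize(navigate())


# Stand-ins for browser automation: two renders of the same page that differ
# only in wrapper markup and whitespace.
def render_a() -> PageState:
    return PageState(
        "https://example.com/docs",
        '<main id="app-1"><h1>Setup</h1><p>Run the <b>installer</b>.</p></main>',
    )


def render_b() -> PageState:
    return PageState(
        "https://example.com/docs",
        '<div class="x9f"><section><h1>Setup</h1>\n'
        "<p>Run the\n <b>installer</b>.</p></section></div>",
    )


print(read_page(render_a))
print(read_page(render_a) == read_page(render_b))
```

Because the model only ever sees the normalized text, changes to wrapper elements, generated class names or whitespace do not change the prompt, which is what keeps the reading surface stable across renders.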