The Google Chrome team announced WebMCP last week. What is it? What problem does it solve? Why is it the future of the internet? Let’s dive in.
WebMCP lets AI agents interact with webpages with a much higher degree of accuracy than they manage today.
But can’t they do that today, you ask?
Well, they can. There are two common approaches, but both have drawbacks.
The Two Current Approaches
1. Screenshots
Exactly what it sounds like. AI agents take screenshots of the page, try to work out what’s going on by parsing the image, and then take a stab at completing forms. It’s a human-vision style of approach, and its main drawback is the screenshot analysis itself: it can be inconsistent, each screenshot is a static snapshot, error handling can be patchy (think validation messages popping up after submit), and because it takes multiple passes, it’s not fast.
2. DOM Parsing
The other widely used approach. The agent reads the underlying page HTML and attempts to understand how a booking form (for example) works .. what fields it has, what types they are, and so on. This too can be challenging. JavaScript-rendered content, frontend and backend validation, and the non-deterministic nature of AI agents all make this approach difficult as well.
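To make the DOM-parsing problem concrete, here is a minimal sketch of the kind of guessing an agent has to do. Everything in it is hypothetical: `guessFields` and the sample markup are invented for illustration, and a real agent would use a proper HTML parser rather than regular expressions. The point is the second call, where the fields are rendered by JavaScript and the raw HTML gives the agent nothing to work with.

```typescript
// Hypothetical sketch of DOM-style guessing; not taken from any real agent.
interface GuessedField {
  name: string;
  type: string;
  required: boolean;
}

// Scan raw HTML for <input> tags and infer what each field wants.
function guessFields(html: string): GuessedField[] {
  const fields: GuessedField[] = [];
  const inputTags = html.match(/<input\b[^>]*>/gi) ?? [];
  for (const tag of inputTags) {
    const name = /\bname="([^"]*)"/i.exec(tag)?.[1];
    if (!name) continue; // unnamed inputs are not submitted, so skip them
    const type = /\btype="([^"]*)"/i.exec(tag)?.[1] ?? "text";
    const required = /\brequired\b/i.test(tag);
    fields.push({ name, type, required });
  }
  return fields;
}

// A server-rendered form: the guess works.
const staticHtml =
  '<form><input name="date" type="date" required><input name="guests" type="number"></form>';
const staticGuess = guessFields(staticHtml);

// A JavaScript-rendered form: the fields do not exist in the HTML yet.
const dynamicHtml = '<form id="booking"></form><script src="render-form.js"></script>';
const dynamicGuess = guessFields(dynamicHtml);

console.log(staticGuess.length, dynamicGuess.length);
```

Even when the guess succeeds, it only recovers field names and types. It says nothing about backend validation rules or what submitting the form actually does.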
Both approaches are essentially agents trying to reverse-engineer what a webpage does. It’s like reading a restaurant menu in a language you half understand and hoping you ordered what you wanted.
Enter WebMCP
From the announcement:
“WebMCP aims to provide a standard way for exposing structured tools, ensuring AI agents can perform actions on your site with increased speed, reliability, and precision.”
It sort of flips the model. Instead of agents guessing, websites declare, on a per-page basis, what capabilities each page contains. AI agents can then use those capabilities to make a booking, reserve a table, file a support ticket .. and it all happens through WebMCP, just like posting a form.
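As a rough illustration of what "websites declare" could mean in practice, the sketch below models a page-level tool declaration and shows an agent checking its arguments against it before acting. The shapes here (`ToolDeclaration`, `ToolParam`, `validateArgs`) and the `reserve_table` example are assumptions made up for this post. They are not the WebMCP format, which may look quite different.

```typescript
// Hypothetical shapes for illustration; the real WebMCP format may differ.
type FieldType = "string" | "number";

interface ToolParam {
  type: FieldType;
  required: boolean;
  description: string;
}

interface ToolDeclaration {
  name: string;
  description: string;
  params: Record<string, ToolParam>;
}

// What a restaurant page might declare (illustrative names only).
const reserveTable: ToolDeclaration = {
  name: "reserve_table",
  description: "Reserve a table for a given date and party size.",
  params: {
    date: { type: "string", required: true, description: "ISO date, e.g. 2030-01-15" },
    guests: { type: "number", required: true, description: "Number of people" },
    notes: { type: "string", required: false, description: "Optional requests" },
  },
};

// The agent checks its arguments against the declaration instead of guessing.
function validateArgs(tool: ToolDeclaration, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [key, param] of Object.entries(tool.params)) {
    const value = args[key];
    if (value === undefined) {
      if (param.required) errors.push(`missing required field: ${key}`);
      continue;
    }
    if (typeof value !== param.type) errors.push(`${key} should be a ${param.type}`);
  }
  for (const key of Object.keys(args)) {
    if (!(key in tool.params)) errors.push(`unknown field: ${key}`);
  }
  return errors;
}

const goodCall = validateArgs(reserveTable, { date: "2030-01-15", guests: 4 });
const badCall = validateArgs(reserveTable, { guests: "four" });
console.log(goodCall, badCall);
```

Because the page states its field names, types, and requirements up front, a bad call can be caught before anything is submitted, rather than discovered from a validation message after the fact.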
The standard proposes two APIs:
- Declarative API — handles standard actions defined in HTML forms. Think structured form submission with clear field definitions.
- Imperative API — manages complex, dynamic interactions requiring JavaScript execution. For the fancy stuff.
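For the imperative side, the general idea can be sketched as a page registering named tools with JavaScript handlers, which an agent then calls by name. This is only a sketch under assumptions: `PageToolRegistry`, `registerTool`, `callTool`, and the `file_support_ticket` tool are hypothetical stand-ins written for this post, not the browser API from the preview, so check the official docs for the real names and signatures.

```typescript
// Hypothetical stand-in for a page-level tool registry; not the real browser API.
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

interface RegisteredTool {
  name: string;
  description: string;
  execute: ToolHandler;
}

class PageToolRegistry {
  private tools = new Map<string, RegisteredTool>();

  // The page registers a tool once, with the code that carries it out.
  registerTool(tool: RegisteredTool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // The agent can discover what the page offers.
  listTools(): string[] {
    return Array.from(this.tools.keys());
  }

  // The agent invokes a tool by name instead of clicking through the UI.
  async callTool(name: string, args: Record<string, unknown>): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.execute(args);
  }
}

const registry = new PageToolRegistry();

registry.registerTool({
  name: "file_support_ticket",
  description: "File a support ticket with a subject line.",
  execute: async (args) => {
    const subject = String(args.subject ?? "");
    if (subject.length === 0) return "error: subject is required";
    // A real page would call its own backend here.
    return `ticket created: ${subject}`;
  },
});

registry
  .callTool("file_support_ticket", { subject: "Login fails" })
  .then((result) => console.log(result));
```

The handler returns structured success or error text directly to the caller, so the agent does not have to scrape the page to find out whether the action worked.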
Why This Matters
This is a fundamental shift. Today, every AI agent that wants to interact with a website has to figure out how that site works on its own. With WebMCP, the website tells the agent exactly what it can do and how to do it. It’s the difference between fumbling with a lockpick and being handed the key.
SaaS providers could benefit massively here as well. Rather than having to develop and maintain their own MCP servers, they can declare their agent-facing tools right in their pages. Your booking platform, your CRM, your project management tool .. all agent-ready with a bit of markup.
What’s Next
Google is running an early preview program for developers to get access to docs and demos. If this gains traction (and with Chrome behind it, it likely will), we could be looking at a web where every site speaks a common language with AI agents.
That’s a big deal. That’s the future of the internet.