Agent vs. the Web

Model Comparison

9 Browser Agents Tried Shopping on Amazon. Only 2 Got It Right.

We ran the same Amazon shopping task with every leading model. The differences are staggering.

FlowTester Research · March 2026

9 AI models compared on the same Amazon shopping task

In Part 1 of Agent vs. The Web, we showed that most major e-commerce sites are failing AI agents.

That raised a harder question: if your site works for one agent, does it work for all of them? The answer matters because you don't get to choose which model your visitors are running.

We took the same 6-step Amazon shopping task and ran it with 9 models across three tiers: light, balanced, and heavy. Same site, same test, same browser environment. The only variable was the model behind the agent.

The result? A 5x gap in speed. A 40x gap in cost. And only 2 out of 9 models chose the right product. If you're a site owner, this is the landscape of agents already hitting your pages.

The Test

Same task as Part 1. Six steps, deliberately simple:

steps:
  1: "Go to Amazon.com"
  2: "Search for 'laptop'"
  3: "Select 'Brand' filter and choose 'HP'"
  4: "Sort the results by Price: Low to High"
  5: "Select the cheapest result and add it to the cart"
  6: "Navigate to the cart page and validate the laptop you chose is in the cart"

A human does this in 90 seconds. We measured how each model handled it across time, cost, token usage, accuracy, and report quality.

The Scoreboard

Here are all 9 models, grouped by tier (light, balanced, heavy). Every number is from a single, real test run on Amazon.com.

Model	Tier	Time (s)	Cost ($)	Steps	Item Selected	Issues	Score
GPT 5.4 Nano	Light	74.5	$0.025	6	$22 sleeve	2	78
Gemini 3.1 Flash Lite	Light	70.3	$0.040	9	$22 sleeve	1	85
Claude Haiku 4.6	Light	211.3	$0.129	10	Stalled at 50%	4	62
GPT 5.4 Mini	Balanced	115.8	$0.131	8	$22 sleeve	2	78
Gemini 3 Flash	Balanced	104.6	$0.101	10	$59 Chromebook ✓	2	82
Claude Sonnet 4.6	Balanced	340.4	$0.549	9	$22 sleeve	7	52
GPT 5.4	Heavy	310.4	$0.430	8	$59 Chromebook ✓	6	68
Gemini 3.1 Pro	Heavy	161.5	$0.340	8	$22 sleeve	2	70
Claude Opus 4.6	Heavy	257.7	$1.037	11	$22 sleeve	7	52

A few things jump out immediately. The fastest model (Gemini 3.1 Flash Lite at 70 seconds) is nearly 5x faster than the slowest (Claude Sonnet 4.6 at 340 seconds). The cheapest run (GPT 5.4 Nano at $0.03) costs 40x less than the most expensive (Claude Opus 4.6 at $1.04). And only two models — Gemini 3 Flash and GPT 5.4 — actually selected the correct product.
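The two headline gaps can be re-derived from the scoreboard. This is a minimal sketch using only the Time and Cost columns above; the variable and function names are illustrative.

```python
# Times (seconds) and costs (USD) copied from the scoreboard.
runs = {
    "GPT 5.4 Nano": (74.5, 0.025),
    "Gemini 3.1 Flash Lite": (70.3, 0.040),
    "Claude Haiku 4.6": (211.3, 0.129),
    "GPT 5.4 Mini": (115.8, 0.131),
    "Gemini 3 Flash": (104.6, 0.101),
    "Claude Sonnet 4.6": (340.4, 0.549),
    "GPT 5.4": (310.4, 0.430),
    "Gemini 3.1 Pro": (161.5, 0.340),
    "Claude Opus 4.6": (257.7, 1.037),
}


def spread(values):
    """Ratio between the largest and smallest value."""
    return max(values) / min(values)


speed_gap = spread([seconds for seconds, _ in runs.values()])
cost_gap = spread([usd for _, usd in runs.values()])

print(f"speed gap: {speed_gap:.1f}x")  # speed gap: 4.8x
print(f"cost gap: {cost_gap:.1f}x")    # cost gap: 41.5x
```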

How Does Your Site Handle These 9 Models?

Run the same test on your own site and see exactly where different AI agents succeed or fail.

Speed: The 5x Gap

Execution time ranged from 70 seconds (Gemini 3.1 Flash Lite) to 340 seconds (Claude Sonnet 4.6). The broad pattern: within each vendor's lineup, lighter models tend to be faster, with Claude Sonnet 4.6 (slower than Opus) as the exception.

Tier	Fastest	Slowest
Light	70s (Flash Lite)	211s (Haiku)
Balanced	105s (Gemini Flash)	340s (Sonnet)
Heavy	162s (Gemini Pro)	310s (GPT 5.4)

But speed isn't just about model size. Gemini 3.1 Pro (heavy tier) completed in 162 seconds, faster than Haiku (light tier) at 211 seconds. Haiku stalled at 50% and never recovered, burning time on retries. A light model that gets stuck is slower than a heavy model that runs straight through.

For site owners, this means the agents visiting your site won't all behave the same way. A lightweight agent that stalls on your page will retry, burn time, and eventually abandon the task — even if a heavier agent would have succeeded on the first try. Your site needs to work for both.

Cost: The 40x Spread

The cheapest run cost $0.03 (GPT 5.4 Nano). The most expensive cost $1.04 (Claude Opus 4.6). That's a 40x difference for the same task on the same site.

Model	Cost
GPT 5.4 Nano	$0.025
Gemini 3.1 Flash Lite	$0.040
Gemini 3 Flash	$0.101
Claude Haiku 4.6	$0.129
GPT 5.4 Mini	$0.131
Gemini 3.1 Pro	$0.340
GPT 5.4	$0.430
Claude Sonnet 4.6	$0.549
Claude Opus 4.6	$1.037

Notice that token count alone doesn't explain cost. Gemini 3.1 Pro used fewer tokens (168K) than Gemini 3 Flash (195K) but cost 3.4x more because of higher per-token pricing. GPT 5.4 Mini used only about 20% more tokens than Nano (152K vs 127K) but cost 5x more. Higher per-token pricing adds up fast.
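One way to see the pricing effect is to divide each run's cost by its token count. The sketch below uses the costs from the table and the token totals quoted in the paragraph above. The result is a blended rate that mixes prompt, completion, and cached tokens, so it is only a rough indicator and not any vendor's published price.

```python
# Cost (USD) from the cost table; total tokens (thousands) as quoted in the text.
runs = {
    "Gemini 3.1 Pro": {"cost": 0.340, "tokens_k": 168},
    "Gemini 3 Flash": {"cost": 0.101, "tokens_k": 195},
    "GPT 5.4 Mini": {"cost": 0.131, "tokens_k": 152},
    "GPT 5.4 Nano": {"cost": 0.025, "tokens_k": 127},
}


def blended_rate(run):
    """Effective USD per 1K tokens for one run (all token types mixed together)."""
    return run["cost"] / run["tokens_k"]


rates = {model: blended_rate(run) for model, run in runs.items()}
for model, rate in rates.items():
    # USD per 1K tokens times 1000 gives USD per 1M tokens.
    print(f"{model}: ${rate * 1000:.2f} per 1M tokens (blended)")

pro_vs_flash = rates["Gemini 3.1 Pro"] / rates["Gemini 3 Flash"]
mini_vs_nano = rates["GPT 5.4 Mini"] / rates["GPT 5.4 Nano"]
print(f"Pro vs Flash blended rate: {pro_vs_flash:.1f}x")  # 3.9x
print(f"Mini vs Nano blended rate: {mini_vs_nano:.1f}x")  # 4.4x
```

On these figures the blended rate differs by roughly 4x inside each pair, which is where most of the cost gap comes from.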

Why does this matter for your site? Because cost drives adoption. The cheaper it is to run an agent on your site, the more agent-powered traffic you'll get. A bloated DOM that forces 279K tokens (as Opus needed) risks pricing out the budget agents that will likely make up most future agent traffic. Lean, semantic HTML isn't just good practice; it's how you stay accessible to the cheapest agents.

The Sleeve Test: When Your Product Data Misleads Agents

Here's where it gets interesting. When you search for “laptop” on Amazon, filter by the HP brand, and sort by price low-to-high, the cheapest result isn't a laptop. It's a $22 laptop sleeve. A human would scroll past it instantly. But 6 out of 9 models grabbed it and declared success, and a seventh (Claude Haiku 4.6) stalled before it reached the cart.

Only two models recognized the problem: Gemini 3 Flash and GPT 5.4. Both identified that the $22 item was a sleeve, not a laptop, and continued scrolling until they found the $59 HP Chromebook — the actual cheapest laptop.

This is a product data problem as much as a model intelligence problem. The HP-filtered search for “laptop” returned a sleeve because Amazon's category filtering let an accessory appear in laptop results. Most models matched on price alone. The two that passed matched on semantics.

For site owners, this is a warning: if your search results mix product categories, most agents will blindly select the wrong item. Clean category taxonomy, proper product-type metadata, and accurate filtering don't just help human shoppers — they're the difference between an agent completing a purchase and an agent adding the wrong product to cart.

What's notable is that this wasn't a heavy-vs-light split. Gemini 3 Flash is a balanced-tier model. GPT 5.4 Nano (light), Claude Sonnet 4.6 (balanced), and Gemini 3.1 Pro (heavy) all picked the sleeve, so every tier had a failure. You can't rely on agents being smart enough to work around messy data.
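The difference between matching on price and matching on semantics can be sketched in a few lines. The listing titles and prices below come from this article and the run logs; the `product_type` field and its values are hypothetical, standing in for whatever category metadata a site exposes.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    price: float
    product_type: str  # hypothetical metadata field, not an Amazon attribute


results = [
    Listing("HP SBUY Rnw Business 14.1 Laptop Slv", 22.00, "laptop_sleeve"),
    Listing("HP Chromebook", 59.00, "laptop"),
    Listing("HP 14 Laptop", 174.00, "laptop"),
]


def cheapest_by_price(listings):
    """What most models did: take the first row of a price-sorted list."""
    return min(listings, key=lambda item: item.price)


def cheapest_of_type(listings, wanted_type):
    """Price match restricted to the requested product type; None if nothing matches."""
    matching = [item for item in listings if item.product_type == wanted_type]
    return min(matching, key=lambda item: item.price) if matching else None


print(cheapest_by_price(results).title)           # HP SBUY Rnw Business 14.1 Laptop Slv
print(cheapest_of_type(results, "laptop").title)  # HP Chromebook
```

If the type field is missing or wrong, the second function has nothing to filter on and the agent falls back to reading titles, which is where most of the models in this test went wrong.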

Report Quality: Depth vs. Speed

After completing the shopping task, each model generated a site-readiness report. The quality gap was dramatic.

Claude Opus 4.6 and Claude Sonnet 4.6 found 7 issues each, including checks for robots.txt configuration, WebMCP availability, and llms.txt presence. Their reports read like a professional audit: structured, specific, and actionable.

Gemini 3.1 Flash Lite found 1 issue. Its report was a paragraph.

The middle-ground models (GPT 5.4, Gemini 3 Flash, Claude Haiku 4.6) found 2-6 issues with varying levels of detail. Within the GPT and Claude families, the heavier models reported more issues even though the task itself was the same; the Gemini models stayed at 1-2 issues regardless of tier.

For site owners, this matters because the quality of the feedback you get depends on the model doing the testing. If you're using AI agents to audit your own site's readiness (which you should be), a light model will tell you “it works.” A heavy model will tell you why it almost didn't.
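The robots.txt check that the heavier models ran is easy to reproduce yourself. Below is a minimal sketch using Python's standard `urllib.robotparser`. The sample file is invented for illustration and is not Amazon's robots.txt; the crawler names are the ones that appear in the Opus run log further down.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def blocked_agents(robots_txt, agents, url="https://example.com/"):
    """Return the user agents that may not fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


agents = ["GPTBot", "ClaudeBot", "PerplexityBot"]
print(blocked_agents(SAMPLE_ROBOTS_TXT, agents))  # ['GPTBot', 'ClaudeBot']
```

To check a live site, fetch its `/robots.txt` and pass the text to the same function.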

Token Efficiency: Your DOM Is the Biggest Cost Driver

Every model used far more prompt tokens than completion tokens. This makes sense: the DOM snapshot of an Amazon product page is enormous, and it gets sent to the model with every step. The agent's actual output (click here, type that) is tiny by comparison. In other words, your page weight is the primary cost driver for every agent that visits your site.

Model	Prompt	Completion	Cached	Ratio (P:C)
GPT 5.4 Nano	121K	5.9K	57K	21:1
Gemini 3.1 Flash Lite	157K	4.8K	43K	33:1
Claude Haiku 4.6	172K	8.1K	81K	21:1
GPT 5.4 Mini	144K	7.5K	0	19:1
Gemini 3 Flash	183K	11.5K	54K	16:1
Claude Sonnet 4.6	205K	11.1K	94K	18:1
GPT 5.4	225K	8.3K	106K	27:1
Gemini 3.1 Pro	156K	12.9K	64K	12:1
Claude Opus 4.6	267K	11.0K	138K	24:1

Cached tokens reveal how much of your page stays the same between steps. Claude Opus 4.6 cached 138K out of 267K prompt tokens (52%), meaning about half of what the model was sent had already been seen in an earlier step. Sites with consistent, stable layouts enable better caching and lower per-step costs. Frequent DOM churn (dynamic ads, reshuffled elements, injected scripts) breaks caching and forces agents to re-process your entire page on every action.
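The ratio column and the cache share can both be recomputed from the table. This sketch uses the rounded figures shown above (in thousands of tokens), so results may differ slightly from values calculated on the raw counts.

```python
# (prompt, completion, cached) in thousands of tokens, copied from the table.
usage = {
    "GPT 5.4 Nano": (121, 5.9, 57),
    "Gemini 3.1 Flash Lite": (157, 4.8, 43),
    "Claude Haiku 4.6": (172, 8.1, 81),
    "GPT 5.4 Mini": (144, 7.5, 0),
    "Gemini 3 Flash": (183, 11.5, 54),
    "Claude Sonnet 4.6": (205, 11.1, 94),
    "GPT 5.4": (225, 8.3, 106),
    "Gemini 3.1 Pro": (156, 12.9, 64),
    "Claude Opus 4.6": (267, 11.0, 138),
}

# Share of prompt tokens served from cache, per model.
cache_share = {model: cached / prompt for model, (prompt, _, cached) in usage.items()}
# Prompt-to-completion ratio, per model.
pc_ratio = {model: prompt / completion for model, (prompt, completion, _) in usage.items()}

for model in usage:
    print(f"{model}: P:C {pc_ratio[model]:.0f}:1, cached {cache_share[model]:.0%} of prompt")
```

On these figures the cache share runs from 0% (GPT 5.4 Mini) to 52% (Claude Opus 4.6).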

Completion length didn't track step count in any obvious way. The most verbose model (Gemini 3.1 Pro at 12.9K completion tokens) finished in 8 steps, the most concise (Gemini 3.1 Flash Lite at 4.8K completion tokens) needed 9, and GPT 5.4 Nano got through in 6 with just 5.9K. More steps means more page loads, more DOM parsing, and more chances for your site to confuse the agent. Clean, predictable UIs reduce step count across all models.
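Dividing prompt tokens by step count gives a rough sense of how much page content each step carries. This sketch assumes the step counts in the scoreboard and the prompt totals in the token table describe the same runs; it is an average, not a per-step measurement.

```python
# (prompt tokens in thousands, steps) from the token table and the scoreboard.
runs = {
    "GPT 5.4 Nano": (121, 6),
    "Gemini 3.1 Flash Lite": (157, 9),
    "Claude Haiku 4.6": (172, 10),
    "GPT 5.4 Mini": (144, 8),
    "Gemini 3 Flash": (183, 10),
    "Claude Sonnet 4.6": (205, 9),
    "GPT 5.4": (225, 8),
    "Gemini 3.1 Pro": (156, 8),
    "Claude Opus 4.6": (267, 11),
}

# Average prompt tokens (thousands) sent per step.
per_step = {model: prompt / steps for model, (prompt, steps) in runs.items()}

for model, tokens_k in sorted(per_step.items(), key=lambda kv: kv[1]):
    print(f"{model}: ~{tokens_k:.1f}K prompt tokens per step")
```

Every model lands somewhere between roughly 17K and 28K prompt tokens per step, so each extra step adds a page-sized chunk of input regardless of which model is driving.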

The Three Tiers of Agents Visiting Your Site

You don't control which model your visitors use. But understanding the tiers helps you know what to expect — and what to test against.

Light Tier — GPT 5.4 Nano, Gemini 3.1 Flash Lite, Claude Haiku 4.6

These are the budget agents — fast, cheap, and likely the majority of agent traffic your site will see. They complete simple, well-structured flows but break on anything ambiguous. They grabbed a laptop sleeve instead of a laptop. One stalled entirely. If your site only works for heavy models, you're losing the high-volume segment.

Balanced Tier — GPT 5.4 Mini, Gemini 3 Flash, Claude Sonnet 4.6

The sweet spot. Gemini 3 Flash stands out: it was one of only two models to pass the sleeve test, ran in under 2 minutes, and cost $0.10. These agents can handle moderate complexity and are likely what powers most commercial agent products today.

Heavy Tier — GPT 5.4, Gemini 3.1 Pro, Claude Opus 4.6

The premium agents. They can be thorough (Claude Opus 4.6 found 7 site issues, GPT 5.4 found 6) but they are expensive and slow. GPT 5.4 correctly identified the laptop but took 8 steps and 310 seconds. These are the agents running high-value tasks where the user is willing to pay $1+ per interaction. Your site should work for them, but don't design exclusively for them.

Is Your Site Ready for All Three Tiers?

Test your website against light, balanced, and heavy AI models to see where each tier struggles — and what to fix first.

What This Means for Your Website

Here are the practical takeaways for site owners:

1. Test against multiple models, not just one. A site that works for Opus might completely break for Haiku. The agents visiting your site span a 40x cost range, and the cheap ones are likely to make up most of the traffic. Test the full spectrum.

2. Your DOM weight is the biggest lever. Every extra kilobyte of DOM gets sent to the model on every step. Opus needed 279K tokens; Nano used 127K for the same task. Lean, semantic HTML makes your site accessible to budget agents that would otherwise choke on your page.

3. Clean your product taxonomy. 6 out of 9 models added a laptop sleeve to cart instead of a laptop. That's a search results problem as much as a model problem. If accessories appear in product-category results, agents will select them. Proper category metadata prevents this.

4. Stable layouts enable caching. Models that cached effectively (Opus at 52% of prompt tokens, Sonnet at 46%) saved significant cost on multi-step flows. If your DOM changes between steps (reshuffled ads, injected scripts, dynamic layout shifts), you're breaking the caching that makes agent interactions affordable.

5. Fewer steps = fewer failure points. The model that needed 11 steps (Opus) spent 258 seconds on your site. The one that needed 6 steps (Nano) spent 75 seconds. Every interaction is a chance for something to go wrong. Simplify your flows: fewer clicks to cart means higher completion rates across all model tiers.

Watch the Models in Action

Every test run was recorded. Watch how each model navigated the same Amazon shopping task:

Claude Opus 4.6

Agent Logs

00:00:00Test run started

00:00:00Agent task: 1. Go to amazon.com 2. Search for 'laptop' 3. Select 'Brand' filter and choose 'HP' 4. Sort the results by Price: Low to High. 5. Select the cheapest result and add it to the cart. 6. Navigate to the cart page and validate the laptop you chose is in the cart.

00:00:00Agent execution starting...

00:00:10Navigated to https://amazon.com

00:00:13Step 1:

00:00:24Eval: Successfully navigated to amazon.com homepage. Verdict: Success

00:00:24Memory: Step 1: On Amazon homepage. Need to search for 'laptop', filter by HP brand, sort by price low to high, add cheapest to cart, then validate. Issues to track: sign-in popup appeared automatically.

00:00:24Next goal: Search for 'laptop' in the Amazon search bar.

00:00:27Typed "laptop" into element with index 14

00:00:29Clicked input type=submit id=nav-search-submit-bu

00:00:30Step 2:

00:00:40Eval: Successfully searched for 'laptop' on Amazon - search results page loaded with over 100,000 results. Verdict: Success

00:00:40Memory: Step 2: On Amazon laptop search results page. Need to find and apply HP brand filter. The page shows results with sponsored items at top. Need to scroll down to find left sidebar filters for Brand.

00:00:40Next goal: Scroll down to find the Brand filter in the left sidebar to select HP.

00:00:43Step 3:

00:00:54Eval: Successfully scrolled down and can see the brand filters in the left sidebar. The HP filter link is visible. Verdict: Success

00:00:54Memory: Step 3: On Amazon laptop search results. Found the HP brand filter in the left sidebar at index 15396. Need to click it to filter by HP brand. After that, need to sort by Price: Low to High, find cheapest, add to cart, and validate.

00:00:54Next goal: Click the HP brand filter to narrow results to HP laptops only.

00:00:55Clicked a role=link "HP" aria-label=Apply HP filter to n

00:01:00Step 4:

00:01:12Eval: Successfully applied HP brand filter - results now show only HP laptops with 6,000+ results. The HP checkbox is checked in the sidebar. Verdict: Success

00:01:12Memory: Step 4: HP brand filter applied successfully. Now need to sort by Price: Low to High. Sort dropdown is at index 28229. Results currently show HP Business Laptop at $299.99 and HP 14 Laptop at $174.00. Issues noted: autocomplete suggestion popup appeared on search (step 1), sign-in popup on homepage.

00:01:12Next goal: Click the Sort dropdown and select 'Price: Low to High' to sort results by price ascending.

00:01:15Step 5:

00:01:24Eval: Successfully selected 'Price: Low to High' sort option - the dropdown now shows this selection. But the page is still loading with spinners visible. Verdict: Partial success - need to wait for results to load.

00:01:24Memory: Step 5: Sort by Price: Low to High applied successfully but page is still loading. Need to wait for results to appear, then find and click the cheapest HP laptop result. Steps completed: 1-4 (navigate, search, HP filter, sort). Remaining: find cheapest, add to cart, validate cart.

00:01:24Next goal: Wait for the page to finish loading so the sorted results appear.

00:01:29Step 6:

00:01:41Eval: Page loaded with sorted results. The cheapest HP laptop is "HP SBUY Rnw Business 14.1 Laptop Slv" at $22.00. Sort by Price: Low to High is confirmed in the dropdown. Verdict: Success

00:01:41Memory: Step 6: Results sorted by Price Low to High. Cheapest HP laptop: "HP SBUY Rnw Business 14.1 Laptop Slv" at $22.00. Clicking Add to cart button. Steps completed: navigate, search, HP filter, sort. Next: add to cart, validate cart. Issues: autocomplete popup on search (step 1), sponsored ads mixed with results.

00:01:41Next goal: Click 'Add to cart' button for the cheapest HP laptop ($22.00 HP SBUY Rnw Business 14.1 Laptop Slv).

00:01:42Clicked button "Add to cart" id=a-autoid-22-announce name=submit.addToCart aria-label=Add to cart

00:01:46Step 7:

00:01:59Eval: Successfully added the cheapest HP laptop (HP SBUY Rnw Business 14.1 Laptop Slv at $22.00) to the cart. Cart sidebar popup appeared confirming the item was added with subtotal $22.00 and quantity 1. Verdict: Success

00:01:59Memory: Step 7: Item added to cart successfully. Cart shows HP SBUY Rnw Business 14.1 Laptop Slv at $22.00, qty 1. Sidebar popup visible with "Go to Cart" link. All steps completed: navigate, search, HP filter, sort low-to-high, add cheapest to cart. Next: navigate to cart page and validate. Issues observed: (1) autocomplete popup on search step 1, (2) sponsored ads mixed with organic results, (3) cart sidebar popup appeared automatically.

00:01:59Next goal: Click 'Go to Cart' to navigate to the cart page and validate the HP laptop is correctly in the cart.

00:02:01Clicked a "Go to Cart"

00:02:03Step 8:

00:02:21Eval: Successfully navigated to the cart page. The HP SBUY Rnw Business 14.1 Laptop Slv at $22.00 is confirmed in the cart with quantity 1 and subtotal $22.00. Verdict: Success

00:02:21Memory: Step 8: All 6 task steps COMPLETED. Cart validated: HP SBUY Rnw Business 14.1 Laptop Slv, $22.00, qty 1. Cart URL: https://www.amazon.com/cart. Issues observed: (1) autocomplete popup on search, (2) sponsored ads mixed with results, (3) cart sidebar popup. Now checking robots.txt and webmcp for report enrichment.

00:02:21Next goal: Check robots.txt to assess AI crawlability for the test report.

00:02:22Step 9:

00:02:46Eval: Successfully loaded robots.txt. All major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, OAI-SearchBot, Google-Extended, Gemini-Deep-Research, etc.) are blocked with Disallow: /. Verdict: Success

00:02:46Memory: ALL TASK STEPS COMPLETED. Cart validated with HP SBUY Rnw Business 14.1 Laptop Slv at $22.00. Robots.txt analysis: Amazon blocks ALL major AI crawlers. Now checking WebMCP and llms.txt before generating final report.

00:02:46Next goal: Check for WebMCP manifest and llms.txt, then generate the final test report.

00:02:46Navigated to https://www.amazon.com/.well-known/webmcp

00:02:46Step 10:

00:03:08Eval: WebMCP manifest check returned 404 - Amazon does not implement WebMCP. Verdict: Success (confirmed absence).

00:03:08Memory: ALL TASK STEPS COMPLETED. Cart validated: HP SBUY Rnw Business 14.1 Laptop Slv at $22.00, qty 1. AI readiness findings: robots.txt blocks ALL AI crawlers, no WebMCP manifest (404), no llms.txt expected. Issues: autocomplete popup, sponsored ads mixed with results, cart sidebar popup. Ready to generate final report.

00:03:08Next goal: Check for llms.txt then generate the final test report with all findings.

00:03:08Navigated to https://www.amazon.com/llms.txt

00:03:08Step 11:

00:04:28Eval: llms.txt returned 404 - Amazon does not have an llms.txt file. Verdict: Success (confirmed absence).

00:04:28Memory: ALL TASK STEPS COMPLETED AND VALIDATED. Cart contains HP SBUY Rnw Business 14.1 Laptop Slv at $22.00, qty 1. AI readiness: robots.txt blocks all AI crawlers, no WebMCP (404), no llms.txt (404). Ready to generate final test report.

00:04:28Next goal: Generate the final structured test report with all findings and call done.

00:04:35Test run completed

AI Agent Readiness

52/100

Amazon provides a functional e-commerce experience that AI agents can navigate through DOM interaction, with decent ARIA labels on key elements (search, filters, buttons). However, AI agent readiness is significantly hampered by: (1) Complete blocking of all AI crawlers in robots.txt, (2) No WebMCP implementation for structured agent interactions, (3) No llms.txt guidance file, (4) Autocomplete interruptions on search, and (5) Sponsored ads mixed with organic results without clear programmatic distinction. The site's navigation patterns are generally predictable (search → filter → sort → add to cart → cart) but dynamic content like loading spinners and sidebar popups require additional handling logic.

Optimization Recommendations

Priority	Category	Recommendation
Critical	Crawlability	Allow AI Crawlers in robots.txt
High	WebMCP	Implement WebMCP Protocol for Structured Agent Interactions
Medium	Crawlability	Add llms.txt File for LLM Guidance
Medium	Navigation	Improve Search Autocomplete Behavior for Programmatic Interaction
Medium	Semantic	Distinguish Sponsored vs Organic Results with Semantic Markup
Low	Performance	Add Loading State Indicators with ARIA Live Regions

Claude Sonnet 4.6

Agent Logs

00:00:00Test run started

00:00:00Agent execution starting...

00:00:18Navigated to https://amazon.com

00:00:20Step 1:

00:00:45Eval: Initial navigation to amazon.com was automatically completed successfully. Verdict: Success

00:00:45Memory: Step 1/6 complete: Successfully loaded Amazon homepage. Now need to search for 'laptop'. No issues detected yet except a sign-in overlay which doesn't block functionality.

00:00:45Next goal: Create todo.md to track the 6-step task, then search for 'laptop' by inputting text into the search bar and submitting the search.

00:00:47Typed "laptop" into element with index 16

00:00:49Clicked input type=submit id=nav-search-submit-bu

00:00:57Step 2:

00:01:16Eval: Successfully searched for 'laptop' and search results page loaded with over 100,000 results. Verdict: Success

00:01:16Memory: Step 2/6 complete. Search for 'laptop' successful, now on results page with filters visible. Need to apply HP brand filter next, then sort by price. Brands section visible with HP filter option at index [18030]. No blocking issues detected; filters have proper aria-labels.

00:01:16Next goal: Click the HP brand filter to narrow results to HP laptops only.

00:01:18Clicked a role=link "HP" aria-label=Apply HP filter to n

00:01:18Step 3:

00:01:40Eval: Clicked HP brand filter and page navigated to filtered results showing HP laptops. URL contains filter parameter. Verdict: Success

00:01:40Memory: Step 3/6 complete. HP brand filter applied - URL shows filter parameter, HP laptops visible in results. Now need to sort by Price: Low to High (step 4), then select cheapest and add to cart (step 5), then validate in cart (step 6). Sort dropdown visible at top showing "Featured" - need to change to Price: Low to High. No blocking issues detected so far.

00:01:40Next goal: Scroll down to see the full filtered results and sidebar filters to confirm HP filter is active, then click the sort dropdown to change from Featured to Price: Low to High.

00:01:45Step 4:

00:02:05Eval: Scrolled down on the HP-filtered results page and can now see the sidebar filters confirming HP is active and the sort dropdown is visible. Verdict: Success

00:02:05Memory: Step 4/6: HP brand filter confirmed active (checkbox checked, Clear button visible). Sort dropdown visible showing "Featured". Need to click dropdown to change sort to "Price: Low to High". No blocking issues detected; filters have proper aria-labels.

00:02:05Next goal: Click the sort dropdown at index [30736] to open the sorting options menu and select "Price: Low to High".

00:02:08Clicked span "Sort by: Featured"

00:02:11Step 5:

00:02:33Eval: Clicked sort dropdown and menu opened successfully showing sort options including "Price: Low to High". Verdict: Success

00:02:33Memory: Step 4/6 in progress. Sort dropdown opened after clicking. Now visible: Featured, Price: Low to High, Price: High to Low, Avg Customer Review, Newest Arrivals, Best Sellers. Need to click "Price: Low to High" option. HP filter confirmed active. No issues detected; ARIA implementation excellent with proper listbox/option roles.

00:02:33Next goal: Click the "Price: Low to High" option (index 47874) to sort HP laptops by ascending price.

00:02:34Clicked a role=option "Price: Low to High" id=s-result-sort-select

00:02:39Step 6:

00:03:04Eval: Successfully applied "Price: Low to High" sort - results now show HP laptops sorted by ascending price with cheapest ($22.00) first. Verdict: Success

00:03:04Memory: Step 5/6 complete. Successfully sorted HP laptops by Price: Low to High. Cheapest laptop is "HP SBUY Rnw Business 14.1 Laptop Slv" at $22.00. Now need to add it to cart and then validate. No blocking issues detected; Amazon has excellent ARIA implementation and predictable navigation.

00:03:04Next goal: Click "Add to cart" button for the cheapest HP laptop ($22.00 "HP SBUY Rnw Business 14.1 Laptop Slv") to add it to the shopping cart.

00:03:05Clicked button "Add to cart" id=a-autoid-24-announce name=submit.addToCart aria-label=Add to cart

00:03:10Step 7:

00:03:34Eval: Successfully clicked "Add to cart" button and item was added to cart - cart popup appeared showing HP SBUY Rnw Business 14.1 Laptop Slv at $22.00. Verdict: Success

00:03:34Memory: Step 5/6 complete. HP laptop ($22.00) successfully added to cart - confirmed by popup showing subtotal $22.00, product image, and cart icon showing 1 item. Now need to navigate to cart page (step 6) using "Go to Cart" button (index 61901) and validate, then generate TestReport. No blocking issues detected throughout test - Amazon has excellent ARIA labels, predictable navigation, fast performance.

00:03:34Next goal: Click the "Go to Cart" button to navigate to the cart page and validate the HP laptop is present with correct details.

00:03:37Clicked a "Go to Cart"

00:03:38Step 8:

00:04:50Step 9:

00:05:58Eval: Clicked "Go to Cart" button and successfully navigated to cart page despite validation error. Cart page loaded with HP laptop ($22.00) visible and validated. Verdict: Success

00:05:58Memory: All 6 steps complete. Currently on Amazon cart page. HP SBUY Rnw Business 14.1 Laptop Slv ($22.00) confirmed in cart with correct details (product name, price, quantity 1, subtotal $22.00). Test completed successfully with no blocking issues. Amazon site shows excellent ARIA implementation, predictable navigation, and fast performance. Ready to generate TestReport JSON.

00:05:58Next goal: Call done action with success=true and complete TestReport JSON containing all test metadata, execution summary, step log, issues detected, optimization recommendations, and AI agent readiness scores.

00:06:06Test run completed

AI Agent Readiness

88/100

Amazon demonstrates excellent AI agent readiness with outstanding ARIA implementation (95/100) and highly predictable navigation patterns (95/100). All interactive elements have descriptive aria-labels, form controls are properly associated with labels, and the site provides consistent navigation patterns across pages. Performance is strong with fast page loads and responsive interactions (90/100). Semantic structure is good (80/100) with logical heading hierarchy and mostly appropriate HTML elements, though product cards could benefit from semantic article tags and Schema.org markup. AI crawlability is solid (85/100) with server-side rendered content and clean HTML structure, though presence of llms.txt and robots.txt AI crawler permissions could not be verified during this test. The primary gap is WebMCP readiness (15/100) - no WebMCP manifest was detected, meaning AI agents must rely entirely on DOM inspection rather than structured APIs. Overall, Amazon provides an excellent experience for AI agents with room for enhancement through WebMCP implementation and enhanced semantic markup.

Optimization Recommendations

Priority	Category	Recommendation
Medium	Semantic	Enhance Semantic HTML in Product Listings
Low	WebMCP	Implement WebMCP for Enhanced AI Agent Interaction
Low	Crawlability	Add llms.txt for AI Agent Discovery
Low	ARIA	Add ARIA Live Region for Cart Updates

Claude Haiku 4.6

Agent Logs

00:00:00Test run started

00:00:00Agent execution starting...

00:00:12Navigated to https://amazon.com

00:00:16Step 1:

00:00:22Agent is thinking...

00:00:27Step 2:

00:00:34Memory: Step 2 of multi-step task. Initialized on Amazon homepage. Now searching for 'laptop' - will input text and click search button. After results page loads, I'll need to apply Brand filter (HP) and sort by price low to high.

00:00:34Next goal: Wait for search results page to load, then locate and apply the 'Brand' filter to select 'HP'.

00:00:36Typed "laptop" into element with index 36

00:00:38Clicked input type=submit id=nav-search-submit-bu

00:00:40Step 3:

00:00:53Eval: Previous search for 'laptop' was successful - search results page loaded showing over 100,000 results. The sort dropdown is visible showing "Featured" as current sort. Now need to scroll down to find the Brand filter sidebar to select HP.

00:00:53Memory: Step 3 of 6: Search completed successfully. Now on Amazon laptop search results page (over 100,000 results). Need to: (1) Find and apply Brand filter for HP, (2) Sort by Price Low to High, (3) Click cheapest laptop, (4) Add to cart, (5) Verify in cart. Current page shows sort dropdown but filters not yet visible - may be in left sidebar below.

00:00:53Next goal: Scroll down to reveal the left sidebar filters panel, locate the Brand filter, and select HP to narrow the results.

00:00:57Step 4:

00:01:03Agent is thinking...

00:01:10Step 5:

00:01:13Clicked a role=link "HP" aria-label=Apply HP filter to n

00:01:17Step 6:

00:01:23Agent is thinking...

00:01:29Step 7:

00:01:34Agent is thinking...

00:01:39Step 8:

00:01:46Eval: The previous step selected "Price: Low to High" from the sort dropdown. The dropdown now shows this option is selected. The page is currently loading results. Verdict: Partial success - sort was applied but results are still loading.

00:01:46Memory: Step 8 of 8 (max reached). On Amazon laptop search results page. Sort "Price: Low to High" has been applied (visible in sort dropdown). Page is loading results - two loading spinners visible. HP brand filter needs to be verified/reapplied if not showing. Plan: Wait for page load, check if results appear and if HP filter is applied, click cheapest laptop, add to cart, verify in cart.

00:01:46Next goal: Wait for the page to finish loading, then check if results are visible and verify the HP brand filter status before proceeding to select the cheapest laptop.

00:01:51Step 9:

00:01:58Agent is thinking...

00:02:05Step 10:

00:02:30Agent is thinking...

00:02:54Step 11:

00:03:02Agent is thinking...

00:03:10Step 12:

00:03:52Test run completed

AI Agent Readiness

62/100

Amazon.com demonstrates moderate readiness for AI agent interaction. Strengths include clear ARIA labels on filter options and generally predictable navigation patterns. Weaknesses include lack of structured data markup on product cards, missing llms.txt guidance, no WebMCP implementation detected, and performance delays during sort operations. The site is navigable for agents but could benefit significantly from semantic HTML improvements, structured data, and clearer loading state communication. The task was 50% complete (search, filter, sort applied) but blocked by agent output formatting errors before final steps could execute.

Optimization Recommendations

Priority	Category	Recommendation
High	Performance	Add loading state feedback for sort operation
Medium	ARIA	Add visible filter badge to confirm active filters
Medium	Semantic	Use semantic list elements for filter groups
Medium	Crawlability	Add llms.txt file for AI agent discovery
Low	Navigation	Add structured data markup for product listings

GPT 5.4

Agent Logs

00:00:00Test run started

00:00:00Agent execution starting...

00:00:11Navigated to https://amazon.com

00:00:14Step 1:

00:00:26Eval: Successfully landed on amazon.com homepage; verdict: Success.

00:00:26Memory: On Amazon homepage with search bar visible. No blocker yet; one non-blocking dynamic account flyout is open. Need to follow user’s exact sequence and later verify the chosen laptop appears in cart.

00:00:26Next goal: Search for 'laptop' using the Amazon search box and submit the search.

00:00:28Typed "laptop" into element with index 47

00:00:31Clicked input type=submit id=nav-search-submit-bu

00:00:31Step 2:

00:00:38Eval: Successfully searched for 'laptop' and reached Amazon results. Verdict: Success. ISSUE: sponsored content dominates the initial viewport, which may impair reliable agent navigation.

00:00:38Memory: Reached laptop search results. Need to apply Brand=HP before sorting, per user instructions. Issues noted so far: 1 warning about sponsored-first layout / dense dynamic result structure.

00:00:38Next goal: Locate the Brand filter and choose HP before sorting the results.

00:00:41Step 3:

00:00:49Agent is thinking...

00:00:57Step 4:

00:01:04Eval: The previous goal to locate the Brand filter succeeded; execution was interrupted by invalid JSON formatting rather than a site failure. Verdict: Success with tooling-format interruption.

00:01:04Memory: Amazon results page is open at the laptop search results with the Brands filter visible; HP filter is available at index 15815. One warning issue logged so far: sponsored-first layout/dense dynamic results can impair reliable agent navigation. Next I will apply HP, then sort low-to-high.

00:01:04Next goal: Click the HP option in the Brand filter to narrow results before sorting.

00:01:06Clicked a role=link "HP" aria-label=Apply HP filter to n

00:01:06Step 5:

00:01:14Eval: Clicked the HP brand filter, but the current page appears mid-render with blank results, so success is only partially verified. ISSUE: dynamic loading leaves the result area temporarily empty.

00:01:14Memory: Amazon results remain open after the HP click; the page now shows the results header and sort control but no visible products yet. Issues tracked: (1) sponsored/dense result layout, (2) dynamic re-render/blank results after filter application. Next I will wait for the filtered results to stabilize, then sort by Price: Low to High.

00:01:14Next goal: Wait for the Amazon results page to finish loading after applying the HP filter so I can confirm the filter and sort low-to-high.

00:01:19Step 6:

00:01:30Eval: Waiting for the filtered results page succeeded; the screenshot now shows stable HP results and the sort dropdown. Verdict: Success.

00:01:30Memory: Amazon search results are stable with the HP filter applied, and the sort control is available at index 29527. Issues tracked: 2 warnings so far — sponsored-first layout and transient blank result area during dynamic reload.

00:01:30Next goal: Set the sort order to 'Price: Low to High' so I can identify and open the cheapest HP laptop result.

00:01:32Step 7:

00:01:41Agent is thinking...

00:01:50Step 8:

00:01:58Agent is thinking...

00:02:05Step 9:

00:02:16Eval: The previous goal to sort results by Price: Low to High succeeded; the screenshot confirms the selected sort order. Verdict: Success.

00:02:16Memory: Amazon results are correctly filtered to HP and sorted low-to-high. I identified a new warning issue: the cheapest listed 'result' appears to be an accessory/sleeve mixed into laptop search results, so I will open the cheapest visible true laptop instead to satisfy the user request. Issues tracked: 3 warnings total.

00:02:16Next goal: Open the cheapest visible actual HP laptop result so I can add it to the cart.

00:02:18Clicked a "HP Chromebook 11A G6 11.6 Inch..."

00:02:21Step 10:

00:02:35Agent is thinking...

00:02:49Step 11:

00:02:58Eval: Opening the cheapest visible actual HP laptop product page succeeded; the screenshot confirms the selected HP Chromebook detail page. Verdict: Success.

00:02:58Memory: Now on the chosen product page for HP Chromebook 11A G6 at $59.00. Progress: steps through search/filter/sort/open are complete; remaining tasks are add-to-cart and cart verification. Issues tracked so far: 3 warnings (sponsored-first layout, transient blank reload, mixed accessory/laptop search results).

00:02:58Next goal: Click the Add to cart button to add the selected HP Chromebook to the cart and observe the confirmation state.

00:02:59Clicked input type=button id=add-to-cart-button name=submit.add-to-cart

00:03:02Step 12:

00:03:11Eval: The previous Add to cart click only partially succeeded because it triggered an upsell modal instead of directly confirming cart addition. Verdict: Partial success with interaction interruption.

00:03:11Memory: We are on the HP Chromebook product page with an 'Add to your order' modal blocking completion of add-to-cart. Core shopping flow is nearly complete; next must dismiss the modal, then navigate to cart and verify the selected laptop is present. Issues tracked: 4 warnings total including this modal interruption.

00:03:11Next goal: Dismiss the protection-plan modal by clicking 'No thanks' so I can verify the item was added and proceed to the cart.

00:03:13Clicked input type=submit

00:03:14Step 13:

00:03:23Agent is thinking...

00:03:32Step 14:

00:03:55Agent is thinking...

00:04:18Step 15:

00:04:27Eval: The previous goal succeeded: the upsell modal is no longer blocking, and the screenshot confirms the HP Chromebook was added to cart. Verdict: Success with warning-level interruption resolved.

00:04:27Memory: We have completed search, HP filter, low-to-high sort, chosen the cheapest visible actual HP laptop, and added it to cart. Current page is the add-to-cart confirmation/smart-wagon page; next we must open the cart and confirm the HP Chromebook 11A G6 is present. Issues tracked: 4 warnings total, no blockers.

00:04:27Next goal: Click 'Go to Cart' and verify that the HP Chromebook 11A G6 laptop is present in the shopping cart.

00:04:29Clicked a "Go to Cart"

00:04:29Step 16:

00:04:37Eval: Clicking 'Go to Cart' succeeded in navigation, but verification failed so far because the cart page content is still blank/loading. Verdict: Partial success with warning-level dynamic loading issue.

00:04:37Memory: Now at the Amazon cart URL, but the main content area is blank and not yet verifiable. Total tracked issues remain 4 warnings plus this continuing blank-load behavior affecting cart verification. Next step is to wait for the cart contents to render, then confirm the HP Chromebook entry is present.

00:04:37Next goal: Wait for the cart page to finish rendering so I can verify the selected HP Chromebook is present in the cart.

00:04:41Step 17:

00:05:21Eval: The wait succeeded and the cart contents are now fully visible; verification of the selected HP Chromebook in the cart is successful.

00:05:21Memory: Task completed end-to-end on Amazon. Verified final cart contains the selected HP Chromebook 11A G6 at $59.00; tracked 4 warning/info-level agent-friction issues and no unresolved blockers.

00:05:21Next goal: Finish the task by returning the required structured TestReport JSON.

00:05:29Test run completed
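
The timestamps in these logs make it possible to see where an agent spends its time. The sketch below parses lines in the `HH:MM:SS` plus message layout shown above and reports the longest silent gap between two consecutive entries. The function names are illustrative and not part of any FlowTester tooling.

```python
import re
from typing import Iterable, List, Tuple

# Timestamp directly followed by the message, as in "00:04:41Step 17:".
LOG_LINE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})(.*)$")


def parse_log(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Convert raw log lines into (seconds_since_start, message) pairs."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything without a timestamp
        hours, minutes, seconds, message = match.groups()
        elapsed = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        entries.append((elapsed, message.strip()))
    return entries


def longest_gap(entries: List[Tuple[int, str]]) -> Tuple[int, str]:
    """Return (gap_seconds, message) for the entry that followed the longest silence."""
    best = (0, "")
    for (previous_time, _), (current_time, message) in zip(entries, entries[1:]):
        gap = current_time - previous_time
        if gap > best[0]:
            best = (gap, message)
    return best
```

Run on the GPT 5.4 log above, the longest gap by these timestamps is the 40 seconds between `Step 17:` at 00:04:41 and the Eval entry at 00:05:21, while the agent waited for the cart page to render.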

AI Agent Readiness

68/100

Amazon’s flow is largely automatable, but dense navigation, sponsored-result prominence, dynamic blank states, and modal interruptions create meaningful friction for AI agents. Core labels are often present, yet the shopping experience would benefit from clearer result semantics, more predictable state transitions, and higher-level agent APIs such as WebMCP.

Optimization Recommendations

Priority	Category	Recommendation
High	Navigation	Reduce ambiguity between sponsored and core search results
High	Semantic	Separate accessories from laptop results for laptop-specific queries
Medium	Performance	Add clearer loading and filter-state feedback during result re-rendering
Medium	ARIA	Make upsell interruption easier to dismiss programmatically
Medium	WebMCP	Expose high-value shopping actions via WebMCP
Low	Crawlability	Publish AI-oriented crawl guidance and structured commerce metadata
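
The accessory problem is the one that separated the two correct models from the rest. As a rough illustration of the judgment GPT 5.4 applied, the sketch below skips listings whose titles look like accessories before picking the cheapest one. The keyword list is a hypothetical heuristic; nothing in the logs shows how any model actually made the call.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical keyword list; real category metadata would be more reliable
# than title matching if the site exposed it.
ACCESSORY_HINTS = ("sleeve", "slv", "case", "charger", "adapter", "skin", "bag")


@dataclass(frozen=True)
class Listing:
    title: str
    price: float


def looks_like_accessory(title: str) -> bool:
    """True if any whole word in the title matches an accessory hint."""
    words = title.lower().replace("-", " ").split()
    return any(hint in words for hint in ACCESSORY_HINTS)


def cheapest_laptop(listings: Iterable[Listing]) -> Optional[Listing]:
    """Return the lowest-priced listing that does not look like an accessory."""
    candidates = [item for item in listings if not looks_like_accessory(item.title)]
    return min(candidates, key=lambda item: item.price, default=None)


# Titles and prices as they appear in the agent logs.
results = [
    Listing("HP SBUY Rnw Business 14.1 Laptop Slv", 22.00),
    Listing("HP Chromebook 11A G6", 59.00),
]
print(cheapest_laptop(results))
```

Title heuristics like this are brittle, which is part of the case for the site publishing an explicit product category.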

GPT 5.4 Mini

Agent Logs

42 lines

00:00:00Test run started

00:00:00Agent execution starting...

00:00:08Navigated to https://amazon.com

00:00:13Step 1:

00:00:21Memory: Progress: 0/6 user steps completed. Observed a hover sign-in popup near Account & Lists; likely not a blocker. No issues recorded yet, but search/filter interactions may reveal labeling or dynamic-update friction.

00:00:21Next goal: Create a task checklist in todo.md, then search Amazon for "laptop" to begin the shopping flow.

00:00:24Typed "laptop" into element with index 24

00:00:26Clicked input type=submit id=nav-search-submit-bu

00:00:27Step 2:

00:00:44Step 3:

00:00:46Eval: I searched Amazon for "laptop" successfully; the results page is now visible with relevant filters, including HP.

00:00:46Memory: Progress: 1 of 5 checklist items completed. Current URL is the Amazon laptop results page. Observed useful accessibility labels on the brand filter links, but the page is busy and ad-heavy; no blocker so far.

00:00:46Next goal: Apply the HP brand filter to narrow the search results to HP laptops.

00:00:48Clicked a role=link "HP" aria-label=Apply HP filter to n

00:00:54Step 4:

00:01:03Step 5:

00:01:10Step 6:

00:01:13Memory: Progress: search completed and HP filter applied. Issue count so far: 0 blockers, 0 warnings, 1 minor interaction note (busy results page with heavy sponsored content). Next step is to sort by price ascending and then identify the cheapest HP laptop result.

00:01:13Next goal: Open the sort control and change it to Price: Low to High so I can identify the cheapest HP result next.

00:01:15Clicked span "Sort by: Featured"

00:01:18Step 7:

00:01:21Eval: I opened the Amazon sort control successfully; the dropdown is visible with sorting options, including "Price: Low to High". No blocker detected, though prior agent JSON formatting errors were noted in history.

00:01:21Memory: Current task progress: search complete, HP brand filter applied, sort menu open. Checklist status: step 2 underway. Key observation: Amazon exposes clear aria labels for filters and sort options, but sponsored results dominate the top of the page and may obscure the cheapest organic item.

00:01:21Next goal: Select "Price: Low to High" from the open sort menu so the results are ordered for cheapest-item selection.

00:01:22Clicked a role=option "Price: Low to High" id=s-result-sort-select

00:01:27Step 8:

00:01:40Step 9:

00:01:44Eval: The last step was to sort the results by Price: Low to High; that appears successful based on the current page state and screenshot, with the sort control showing the low-to-high selection.

00:01:44Memory: Progress: search completed, HP filter applied, sort set to Price: Low to High. One notable issue: the cheapest visible item appears to be an ambiguously named HP "Laptop Slv" product at $22, which may not be an actual laptop; I will use the visible cheapest result while validating cart contents afterward. No blockers encountered so far.

00:01:44Next goal: Click the Add to cart button for the cheapest visible HP result, then navigate to the cart and confirm the item is present.

00:01:45Clicked button "Add to cart" id=a-autoid-51-announce name=submit.addToCart aria-label=Add to cart

00:01:49Step 10:

00:01:51Eval: I clicked Add to cart for the cheapest visible HP result and it appears successful based on the open cart drawer with the item shown; verification in the full cart is still pending.

00:01:51Memory: Progress: 4/5 checklist items effectively done, with the cart drawer confirming the chosen HP item is in the cart context. No blockers encountered. Next step is to open the cart page and confirm the item name there before finalizing the structured test report.

00:01:51Next goal: Click Go to Cart in the cart drawer to open the cart page and verify the selected HP laptop is present.

00:01:53Clicked a "Go to Cart"

00:01:55Step 11:

00:02:04Eval: I clicked Go to Cart from the cart drawer and confirmed the browser is on the Amazon cart page with the HP item visible; success.

00:02:04Memory: Task is complete. The cart contains 1 item: HP SBUY Rnw Business 14.1 Laptop Slv at $22.00. Issue count remains 0 blockers, with only minor note about busy/ad-heavy results and dynamic UI behavior.

00:02:04Next goal: Compile and return the final structured TestReport JSON with the completed shopping flow and AI-agent optimization observations.

00:02:21Test run completed

AI Agent Readiness

78/100

Amazon is usable for AI agents and provided enough labeling to complete the requested shopping flow successfully. The main friction comes from dense, ad-heavy results pages and limited machine-readable product structure. ARIA labeling is comparatively strong, but crawlability and WebMCP readiness appear only moderate.

Optimization Recommendations

Priority	Category	Recommendation
High	ARIA	Maintain strong ARIA labeling for filters and sort controls
Medium	Navigation	Separate sponsored and organic results more clearly
Medium	Content	Make product names more explicit and machine-parsable
Low	Crawlability	Expose richer structured data for search result cards

GPT 5.4 Nano

Agent Logs

31 lines

00:00:00Test run started

00:00:00Agent execution starting...

00:00:15Navigated to https://amazon.com

00:00:18Step 1:

00:00:25Eval: Homepage successfully loaded and ready for searching.

00:00:25Memory: Currently on https://www.amazon.com/. Search box available via shadow-open input index 24 and submit button indices 135/136. Need to: search 'laptop', filter Brand=HP, sort Price Low to High, add cheapest result to cart, then verify on cart page.

00:00:25Next goal: Search Amazon for 'laptop' using the header search input and submit the query.

00:00:28Typed "laptop" into element with index 24

00:00:31Clicked input type=submit id=nav-search-submit-bu

00:00:37Step 2:

00:00:41Eval: Previous action typing/clicking search submit succeeded in reaching the results page; we can see the results list and filtering/sorting UI now, so progress is successful.

00:00:41Memory: On Amazon search results for “laptop” (URL shows /s?k=laptop...). Need to apply Brand filter HP (apply link index 17763), then sort by Price: Low to High (sort select index 8700). After that, add cheapest HP laptop (need to use first result’s Add to cart after sorting) and verify on cart page (cart link index 9502 visible).

00:00:41Next goal: Apply the HP brand filter and then set the sort order to Price: Low to High.

00:00:43Clicked a role=link "HP" aria-label=Apply HP filter to n

00:00:49Step 3:

00:00:52Eval: Partially successful: the HP filter was clicked, but sorting has not yet been set to “Price: Low to High”, so the cheapest-result selection and cart validation are not ready.

00:00:52Memory: On Amazon results page (/s?k=laptop...) with HP-related results shown. HP filter click occurred previously, but we still need to set sort to Price Low to High, then add the cheapest HP laptop and validate it in cart. No cart navigation yet.

00:00:52Next goal: Set the sort dropdown to “Price: Low to High” so the first result becomes the cheapest, then we can add it to cart.

00:00:54Step 4:

00:00:57Memory: Progress so far: searched for “laptop”, applied HP filter (clicked HP filter link earlier), and set sort to “Price: Low to High”. Current state is likely mid-reload: screenshot shows only loading spinners, so we must wait for results grid/cards to appear before proceeding to add the cheapest HP laptop to cart.

00:00:57Next goal: Wait for the Amazon search results page to finish loading so product cards appear, then click the first (cheapest) HP laptop’s “Add to cart”.

00:01:10Step 5:

00:01:14Memory: Status: Search 'laptop' done; Brand filter HP applied; Sort set to 'Price: Low to High'. Results are rendered (not loading-only). Visible first laptop candidate is 'HP SBUY Rnw Business 14.1 Laptop Slv' priced $22.00 with Add to cart control index 46784. Next: add to cart, then open cart (cart link index 30548) and verify the same item is present.

00:01:14Next goal: Add the cheapest visible HP laptop result (“HP SBUY Rnw Business 14.1 Laptop Slv” at $22.00) to cart, then navigate to the cart page to verify it’s listed.

00:01:15Clicked span "Add to cart" id=a-autoid-24

00:01:17Clicked span

00:01:18Step 6:

00:01:30Memory: Cheapest HP laptop selected earlier: “HP SBUY Rnw Business 14.1 Laptop Slv” at $22.00. Currently verified on cart page where the same item appears and subtotal shows $22.00 (1 item). No further actions needed.

00:01:30Next goal: Finalize the test report JSON (including validation evidence and AI-agent optimization findings) and end the run.

00:01:38Test run completed

AI Agent Readiness

78/100

Task completion was successful, indicating generally workable navigation/controls, but agent reliability is reduced by transient loading states after sort/filter and by critical controls residing in shadow DOM.

Optimization Recommendations

Priority	Category	Recommendation
Medium	Performance	Add explicit stable loading indicators and ARIA live updates for search results refresh
Medium	Semantic	Improve crawl/agent friendliness around shadow DOM search components
Low	ARIA	Ensure icon/spans used as action targets expose clear accessible names
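
Until sites signal loading state explicitly, agents have to infer it. One common workaround, sketched below, is to poll some observable quantity (here, the number of rendered result cards) until it is non-zero and stops changing. `read_count` is a hypothetical callable supplied by the harness, and the timeout is approximated as a poll budget to keep the example simple.

```python
import time
from typing import Callable, Optional


def wait_until_stable(
    read_count: Callable[[], int],
    stable_polls: int = 3,
    interval_s: float = 0.5,
    timeout_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until read_count() returns the same non-zero value stable_polls times in a row."""
    max_polls = max(1, int(timeout_s / interval_s))
    last: Optional[int] = None
    streak = 0
    for _ in range(max_polls):
        current = read_count()
        if current > 0 and current == last:
            streak += 1
        else:
            # A changed or empty reading restarts the streak.
            streak = 1 if current > 0 else 0
        if streak >= stable_polls:
            return True
        last = current
        sleep(interval_s)
    return False
```

With readings of 0, 0, 12, 16, 16, 16 and the defaults, the function returns `True` on the sixth poll. Injecting `sleep` keeps the helper testable without real delays.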

Gemini 3.1 Pro

Agent Logs

42 lines

00:00:00Test run started

00:00:00Agent execution starting...

00:00:14Navigated to https://amazon.com

00:00:18Step 1:

00:00:26Eval: Successfully navigated to amazon.com.

00:00:26Memory: On Amazon homepage. Need to search for 'laptop'.

00:00:26Next goal: Search for 'laptop' using the search bar.

00:00:28Typed "laptop" into element with index 15

00:00:30Clicked input type=submit id=nav-search-submit-bu

00:00:31Step 2:

00:00:39Eval: Successfully executed the search for 'laptop' and reached the search results page.

00:00:39Memory: Searched for 'laptop' on Amazon. Currently on the results page. Need to find and apply the 'HP' brand filter.

00:00:39Next goal: Scroll down to locate the 'Brand' filter and select 'HP'.

00:00:42Step 3:

00:00:51Eval: Successfully scrolled down the page, making the 'Brands' filter section visible in the viewport. Verdict: Success.

00:00:51Memory: Searched for 'laptop' on Amazon. Scrolled down to find filters. The 'HP' brand filter is now visible on the left sidebar. Next step is to click it to filter the results.

00:00:51Next goal: Click the 'HP' brand filter to narrow down the search results to HP laptops.

00:00:53Clicked a role=link "HP" aria-label=Apply HP filter to n

00:00:57Step 4:

00:01:08Eval: Successfully clicked the HP brand filter. The page updated and the filter is now active. Verdict: Success.

00:01:08Memory: Search for 'laptop' completed. HP brand filter applied. Now proceeding to sort the results by price.

00:01:08Next goal: Sort the search results by 'Price: Low to High' using the sort dropdown.

00:01:10Step 5:

00:01:19Eval: Successfully selected the sort option 'Price: Low to High', but the page is currently loading the results. Verdict: Success.

00:01:19Memory: Sorted results by 'Price: Low to High'. The page is currently loading the new results. I need to wait for the products to appear.

00:01:19Next goal: Wait for the sorted product results to load on the page.

00:01:23Step 6:

00:01:35Eval: Successfully waited for the sorted results to load. The page now displays HP laptops sorted by price from low to high. Verdict: Success.

00:01:35Memory: Sorted HP laptops by price (Low to High). The cheapest result is a laptop sleeve for $22.00. I am proceeding to add it to the cart using the inline button.

00:01:35Next goal: Click the 'Add to cart' button for the cheapest result.

00:01:36Clicked button "Add to cart" id=a-autoid-22-announce name=submit.addToCart aria-label=Add to cart

00:01:40Step 7:

00:01:52Eval: Successfully added the cheapest item to the cart, as confirmed by the cart flyout and the updated cart count. Verdict: Success.

00:01:52Memory: Added the cheapest HP 'laptop' result (which is actually a sleeve) to the cart. The cart icon updated to show 1 item. Now proceeding to the cart page to validate.

00:01:52Next goal: Click the cart icon to navigate to the cart page.

00:01:54Clicked a "1 Cart" id=nav-cart aria-label=1 item in cart

00:01:55Step 8:

00:02:55Eval: Successfully navigated to the cart page and validated the item is present. Verdict: Success.

00:02:55Memory: The task is fully complete. The item is in the cart. Generating the final test report.

00:02:55Next goal: Call done with the generated test report.

00:03:01Test run completed

AI Agent Readiness

70/100

Amazon is generally navigable by AI agents due to clear element IDs and standard e-commerce patterns, but dynamic content updates (like sorting and autocomplete) require careful handling and explicit waits. Lack of WebMCP support means agents must rely entirely on DOM interaction.

Optimization Recommendations

Priority	Category	Recommendation
Medium	Performance	Improve Sort Action Feedback
Low	WebMCP	Implement WebMCP for Search and Filter

Gemini 3 Flash

Agent Logs

52 lines

00:00:00Test run started

00:00:00Agent execution starting...

00:00:16Navigated to https://amazon.com

00:00:19Step 1:

00:00:27Eval: Successfully navigated to amazon.com. Verdict: Success

00:00:27Memory: Started the test on Amazon.com. Next step is to search for 'laptop'. No issues detected yet.

00:00:27Next goal: Search for 'laptop' using the search bar.

00:00:29Typed "laptop" into element with index 17

00:00:31Clicked input type=submit id=nav-search-submit-bu

00:00:33Step 2:

00:00:38Eval: Successfully searched for 'laptop' and reached the results page. Verdict: Success

00:00:38Memory: Search for 'laptop' completed. Next step is to find and apply the 'HP' brand filter. I need to scroll to find the sidebar filters. No issues detected so far.

00:00:38Next goal: Scroll down to find the 'HP' brand filter in the sidebar.

00:00:42Step 3:

00:00:47Eval: Successfully scrolled down and located the HP brand filter. Verdict: Success

00:00:47Memory: Step 3 of 500. Located HP brand filter at index 15744. No issues detected so far. Next: Apply the filter and then sort by price.

00:00:47Next goal: Click the 'HP' brand filter to narrow the search results.

00:00:49Clicked a role=link "HP" aria-label=Apply HP filter to n

00:00:49Step 4:

00:00:55Eval: Successfully clicked the HP brand filter. Verdict: Success

00:00:55Memory: Step 4 of 500. HP brand filter applied. Next: Sort results by price (Low to High). No issues detected so far.

00:00:55Next goal: Sort the results by 'Price: Low to High' using the sort dropdown.

00:00:56Step 5:

00:01:01Eval: Successfully selected 'Price: Low to High' from the sort dropdown. Verdict: Success. However, the page is still loading the results.

00:01:01Memory: Step 5 of 500. Sorted results by Price: Low to High. Currently waiting for the results to load. No issues detected yet beyond standard loading time.

00:01:01Next goal: Wait for the sorted results to load so I can select the cheapest HP laptop.

00:01:06Step 6:

00:01:15Eval: Successfully sorted results by price and waited for the page to load. Verdict: Success.

00:01:15Memory: Step 6 of 500. Results sorted Low to High. Identified 'HP Chromebook 11A G6' at $59.00 as the cheapest laptop (ignoring the $22 sleeve). 1 issue noted: noise in search results.

00:01:15Next goal: Click on the cheapest HP laptop ($59.00) to view its details and add it to the cart.

00:01:17Clicked h2 "HP Chromebook 11A G6 11.6 Inch..." aria-label=HP Chromebook 11A G6

00:01:20Step 7:

00:01:25Eval: Successfully navigated to the product page for the HP Chromebook 11A G6. Verdict: Success.

00:01:25Memory: Step 7 of 500. On the product page for the cheapest HP laptop ($59.00). Next: Add to cart and then verify in the cart page. No blockers identified.

00:01:25Next goal: Click the 'Add to cart' button to add the laptop to the shopping basket.

00:01:26Clicked input type=submit id=add-to-cart-button name=submit.add-to-cart

00:01:29Step 8:

00:01:35Eval: Successfully clicked 'Add to cart', which triggered a protection plan upsell modal. Verdict: Success (modal expected).

00:01:35Memory: Step 8 of 500. Laptop added to cart, but blocked by a protection plan popup. I am about to click 'No thanks' (index 60566) to proceed to the cart verification step. Total issues detected: 1 (search noise).

00:01:35Next goal: Dismiss the protection plan popup by clicking 'No thanks' to proceed to the cart.

00:01:36Clicked input type=submit

00:01:37Step 9:

00:01:42Eval: Successfully dismissed the protection plan popup (or reached the confirmation page directly) and confirmed the item was added. Verdict: Success.

00:01:42Memory: Step 9 of 500. Laptop 'HP Chromebook 11A G6' ($59.00) added to cart. Currently on confirmation page. Issues: 1 (search noise). Next: Go to cart page.

00:01:42Next goal: Click 'Go to Cart' to navigate to the shopping cart and verify the item.

00:01:43Clicked a "Go to Cart"

00:01:45Step 10:

00:02:01Eval: Successfully navigated to the cart page and validated the presence of the correct laptop. Verdict: Success.

00:02:01Memory: Task completed. 10 steps total. Identified 2 warnings: search noise and upsell modal. Final item in cart: HP Chromebook 11A G6 ($59.00).

00:02:01Next goal: Provide the final test report and terminate the session.

00:02:09Test run completed

AI Agent Readiness

82/100

Amazon is highly functional for agents but contains significant 'friction' in the form of upsell modals and mixed-category search results. ARIA labels are generally good, but the lack of WebMCP and semantic product markers reduces efficiency.

Optimization Recommendations

Priority	Category	Recommendation
High	WebMCP	Implement WebMCP for Cart Actions
Medium	Semantic	Improve Product Categorization in DOM

Gemini 3.1 Flash Lite

Agent Logs

42 lines

00:00:00Test run started

00:00:00Agent execution starting...

00:00:10Navigated to https://amazon.com

00:00:14Step 1:

00:00:17Eval: Successfully navigated to Amazon. Verdict: Success

00:00:17Memory: Navigated to Amazon. Ready to search for 'laptop'.

00:00:17Next goal: Search for 'laptop' in the search bar.

00:00:19Typed "laptop" into element with index 43

00:00:22Clicked input type=submit id=nav-search-submit-bu

00:00:26Step 2:

00:00:29Eval: Successfully searched for 'laptop'. Verdict: Success.

00:00:29Memory: Search for 'laptop' completed. Now proceeding to filter results by 'HP' brand and then sort by price.

00:00:29Next goal: Locate and apply the 'HP' brand filter from the sidebar.

00:00:33Step 3:

00:00:36Eval: Successfully scrolled to reveal filters. Verdict: Success.

00:00:36Memory: I am on the Amazon search results page for 'laptop'. I have identified the 'HP' brand filter at index [17174]. My next step is to click this filter to narrow down the results.

00:00:36Next goal: Apply the 'HP' brand filter by clicking on the corresponding checkbox/link.

00:00:37Clicked a role=link "HP" aria-label=Apply HP filter to n

00:00:44Step 4:

00:00:46Eval: Applied the 'HP' brand filter successfully. Verdict: Success.

00:00:46Memory: Already searched for 'laptop' and applied 'HP' brand filter. The current page is displaying filtered results. Next, I will sort by 'Price: Low to High'.

00:00:46Next goal: Sort the current results by 'Price: Low to High' using the dropdown menu.

00:00:49Step 5:

00:00:52Eval: Sorting by price was initiated, but the page is still loading the results. Verdict: Success (in progress).

00:00:52Memory: Already searched for 'laptop', filtered by 'HP', and set sort order to 'Price: Low to High'. Currently waiting for the results to render on the page.

00:00:52Next goal: Wait for the search results to load, then identify the cheapest HP laptop.

00:00:59Step 6:

00:01:02Eval: Successfully sorted by price and identified the cheapest HP laptop. Verdict: Success.

00:01:02Memory: Found cheapest HP laptop ('HP SBUY Rnw Business 14.1 Laptop Slv') at $22.00. Ready to add to cart.

00:01:02Next goal: Add the cheapest HP laptop to the cart.

00:01:04Clicked button "Add to cart" id=a-autoid-40-announce name=submit.addToCart aria-label=Add to cart

00:01:08Step 7:

00:01:11Eval: Successfully added the item to the cart. Verdict: Success.

00:01:11Memory: Already added the cheapest HP laptop to the cart. Now navigating to the cart page to verify the item.

00:01:11Next goal: Navigate to the cart page and verify the item is present.

00:01:13Clicked a "Go to Cart"

00:01:14Step 8:

00:01:20Eval: Navigated to the cart page and confirmed the item is present. Verdict: Success.

00:01:20Memory: Task completed successfully. All steps from searching for 'laptop', filtering by 'HP', sorting by price, adding the cheapest item, and verifying in the cart were performed.

00:01:20Next goal: Call the done action to provide the final test report.

00:01:29Test run completed

AI Agent Readiness

85/100

Amazon is generally well-structured for AI agents, though dynamic content requires careful handling of wait states.

Optimization Recommendations

Priority	Category	Recommendation
Low	ARIA	Add ARIA labels to filter elements

Ready to See How AI Agents Experience Your Site?

Get a detailed report showing exactly where AI agents struggle on your site — and what to fix first.

Methodology note: All tests were conducted in March 2026 using FlowTester's browser-based testing framework on Amazon.com. Each model ran the identical 6-step task in a clean browser environment. Costs are calculated from each provider's published API pricing at the time of testing. Token counts include all agent-browser interactions. Scores are FlowTester's composite metric combining task completion, accuracy, and report quality.
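
To make the cost calculation concrete, here is a sketch of pricing one run from its token counts. It assumes rates quoted per million tokens and billed separately for input and output, which is a common convention but is not confirmed by the methodology note. The rates and token counts below are placeholders, not any provider's real prices or measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Placeholder rates in USD per one million tokens.
    input_per_mtok: float
    output_per_mtok: float


def run_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD for one run, given token counts and per-million-token rates."""
    total = input_tokens * pricing.input_per_mtok + output_tokens * pricing.output_per_mtok
    return total / 1_000_000


# Illustrative numbers only.
example = Pricing(input_per_mtok=1.0, output_per_mtok=4.0)
print(round(run_cost(200_000, 10_000, example), 3))
```

Because browser agents resend page state on every step, input tokens usually dominate, so step count and page size both feed into the final bill.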