Practical browser automation recipes — login flows, form filling, data extraction, and more
Common patterns for browser automation. These examples show the agent’s workflow — you just describe what you need in plain language, and the agent handles the tool calls.
The agent can handle multi-field forms with dropdowns, checkboxes, and text areas.
"Fill out the support ticket form on our internal tool: set priority to High, category to Billing, and describe the issue as 'Customer charged twice for subscription'"
The agent snapshots the form, identifies each field by its label, fills text inputs, selects dropdown values, and clicks submit.
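You never write this yourself, but the label-matching step above can be pictured roughly as follows. This is a minimal sketch, not the agent's actual implementation: the `Field` shape, the element refs such as `e1`, and the action names (`select`, `type`, `click`) are all hypothetical stand-ins for whatever the real snapshot and tool calls look like.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Field:
    """One form control as it might appear in a page snapshot (hypothetical shape)."""
    ref: str    # opaque element reference taken from the snapshot
    label: str  # visible label text
    kind: str   # "text", "textarea", "select", or "checkbox"


def plan_form_actions(snapshot: list[Field], values: dict[str, object]) -> list[tuple]:
    """Match requested values to fields by label and return (action, ref, value) steps."""
    by_label = {f.label.strip().lower(): f for f in snapshot}
    actions = []
    for label, value in values.items():
        field = by_label.get(label.strip().lower())
        if field is None:
            raise KeyError(f"no field labelled {label!r} in snapshot")
        if field.kind == "select":
            actions.append(("select", field.ref, value))
        elif field.kind == "checkbox":
            actions.append(("check" if value else "uncheck", field.ref, None))
        else:
            # Text inputs and text areas are both filled by typing.
            actions.append(("type", field.ref, value))
    return actions


# Made-up snapshot of the support ticket form from the example prompt.
snapshot = [
    Field("e1", "Priority", "select"),
    Field("e2", "Category", "select"),
    Field("e3", "Description", "textarea"),
]
plan = plan_form_actions(snapshot, {
    "priority": "High",
    "category": "Billing",
    "description": "Customer charged twice for subscription",
})
plan.append(("click", "submit", None))  # final step: submit the form
```

Matching on normalized label text is why naming fields the way they appear on the page ("priority", "category") tends to work well in a prompt.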
See what API calls a page is making — useful for debugging or discovering internal endpoints.
"Open the dashboard, click the refresh button, and show me what API calls it makes"
The agent uses network inspection to capture every HTTP request the page triggers and reports each one's URL, method, status code, and response size. This makes it easy to spot the internal endpoints a page depends on.
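To make the output concrete, here is a small sketch of how captured traffic could be filtered and summarized. The `CapturedRequest` record, the `resource_type` values, and the `dashboard.example.com` URLs are illustrative assumptions; the agent's real capture format may differ.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class CapturedRequest:
    """One captured HTTP request (hypothetical record shape)."""
    method: str
    url: str
    status: int
    size_bytes: int
    resource_type: str  # e.g. "xhr", "fetch", "script", "image"


def api_calls(requests):
    """Keep only XHR/fetch traffic, which is where a page's data APIs usually show up."""
    return [r for r in requests if r.resource_type in ("xhr", "fetch")]


def summarize(requests):
    """Render one line per request: method, path, status code, response size."""
    lines = []
    for r in requests:
        path = urlsplit(r.url).path  # drop scheme, host, and query string
        lines.append(f"{r.method:<6} {path:<28} {r.status}  {r.size_bytes} B")
    return lines


# Made-up traffic for a dashboard refresh.
captured = [
    CapturedRequest("GET", "https://dashboard.example.com/static/app.js", 200, 184320, "script"),
    CapturedRequest("GET", "https://dashboard.example.com/api/v1/metrics?range=7d", 200, 5120, "fetch"),
    CapturedRequest("POST", "https://dashboard.example.com/api/v1/refresh", 202, 64, "xhr"),
]

for line in summarize(api_calls(captured)):
    print(line)
```

Filtering out scripts, images, and stylesheets leaves only the calls that carry data, which is usually what you want when debugging.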
Combine multiple browser actions into complex workflows.
"Go to our HR portal, check each team member's profile, and compile their job titles and departments into a spreadsheet"
For tasks like this that involve iterating through many items, the agent often discovers a more efficient approach — see API Discovery for how the agent can find internal APIs and fetch all data in parallel instead of clicking through one by one.
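The difference between clicking through profiles one by one and fetching them in parallel can be sketched like this. It assumes an internal profile endpoint has already been discovered; `fake_fetch_profile` is a hypothetical stand-in for that HTTP call, and the employee data is invented for illustration.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def fetch_all(ids, fetch_one, max_workers=8):
    """Fetch every profile concurrently; results come back in the same order as ids."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, ids))


def to_csv(rows, columns):
    """Serialize a list of dicts to CSV text, keeping only the requested columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_fetch_profile(employee_id):
    """Hypothetical stand-in for one request to a discovered profile endpoint."""
    directory = {
        101: {"name": "A. Rivera", "title": "Engineer", "department": "Platform"},
        102: {"name": "B. Chen", "title": "Designer", "department": "Product"},
        103: {"name": "C. Okafor", "title": "Analyst", "department": "Finance"},
    }
    return directory[employee_id]


profiles = fetch_all([101, 102, 103], fake_fetch_profile)
spreadsheet = to_csv(profiles, ["name", "title", "department"])
print(spreadsheet)
```

With a real endpoint, each `fetch_one` call would be a network request, so running them on a thread pool cuts the total time roughly to that of the slowest batch rather than the sum of all requests.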
Tips for best results
Be specific about what you want — “Extract the employee names and emails from the table” is better than “get the data”
Mention if you’re already logged in, so the agent can skip the login flow
Describe the page structure if it’s complex — “the data is in the second tab, under the Summary section”
Ask for a specific output format — “put it in a spreadsheet” or “format as a table”
Troubleshooting
Page not loading? — The agent will retry navigation. If it keeps failing, check that the URL is correct and the site is accessible.
Can’t find an element? — The agent re-snapshots the page after navigation. If elements load dynamically, it may need to wait or scroll first.
Login not working? — Some sites use CAPTCHAs or multi-factor auth that the browser can’t automate. You may need to log in manually first, then let the agent continue.
Interactions seem flaky? — The agent will re-snapshot and retry. Dynamic pages with animations may need a brief wait between actions.
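The re-snapshot-and-retry behavior described in these troubleshooting notes follows a common pattern, sketched below. The helper name, its parameters, and the default wait are assumptions chosen for illustration, not the agent's actual settings.

```python
import time


def retry_with_resnapshot(action, take_snapshot, attempts=3, wait_seconds=0.5, sleep=time.sleep):
    """Run action(snapshot); on failure, wait, take a fresh snapshot, and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        snapshot = take_snapshot()  # always act on a fresh view of the page
        try:
            return action(snapshot)
        except LookupError as exc:  # element not present in this snapshot
            last_error = exc
            if attempt < attempts - 1:
                sleep(wait_seconds)  # give dynamic content time to load
    raise last_error


# Simulated page where the Submit button only appears in the third snapshot.
snapshots = iter([[], [], ["Submit"]])


def take_snapshot():
    return next(snapshots)


def click_submit(snapshot):
    if "Submit" not in snapshot:
        raise LookupError("Submit button not in snapshot yet")
    return "clicked"


waits = []  # record waits instead of sleeping, so the demo runs instantly
result = retry_with_resnapshot(click_submit, take_snapshot, sleep=waits.append)
```

If every attempt fails, the last error is re-raised, which corresponds to the point where you would check the URL, scroll, or log in manually as described above.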