[8/4] How a Scotsman saved hours of my time by turning an LLM into my virtual assistant

Half story, half educational inspiration for your next browser automation journey.

Sometimes you get stuck thinking about less-than-ideal solutions, missing the big green elephant staring right at you. I had a nice win in a short time that I didn't want to leave unshared.

I had used a bookkeeping app for several years to file my taxes. When I decided to switch accountants, I found that the app isn't very interested in you leaving, so it doesn't offer an "export everything" option. (A black-hat churn buster right there for you SaaS builders!)

[Screenshots: a couple of the app's screens]

Basically, I was facing a gigantic mechanical task: clicking through dozens of pages and downloading hundreds of invoices I had uploaded manually over the past few years.

My first thought was that there was no way I'd want to make time for that. But if I didn't do it, who would? A virtual assistant, of course! Just recently, I had spoken with an entrepreneur buddy of mine who works in the virtual assistant business. He spoke very highly of virtual assistants from the Philippines and recommended a Facebook group for finding trusted people.

So the solution was obvious: find a virtual assistant! I still hesitated, though, because that meant another daunting task: finding someone I could trust enough to pay for the job and to share my accounting data and passwords with.

Luckily, a Scottish man came to Barcelona for a day of co-working and enlightened me:

"Why don't you use a browser script or ChatGPT Atlas?" - A Scotsman. 🏴󠁧󠁢󠁳󠁣󠁴󠁿

Then it dawned on me. What kind of an engineer was I? My recent conversations had gotten me stuck thinking like a businessman, and a very lousy, pre-November-2022 one at that.

ChatGPT Atlas had just launched, but it had two fatal flaws at the time: no image upload and no way to download an image.

We prioritized safety as we built ChatGPT’s agent capabilities in Atlas, and added safeguards to address new risks that can come from access to logged-in sites and browsing history while taking actions on your behalf, for example:
- It cannot run code in the browser, download files, or install extensions
- It cannot access other apps on your computer or file system

Basically, it couldn't do anything.

So I took Claude Desktop off its dusty shelf and started vibing. I was close to getting frustrated by the fifth prompt, but Claude convinced me to push on!

I believed the robot, took a step back, and asked it to make it work for one row first. After that, it was clear that success was imminent.
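The "one row first" idea can be sketched roughly like this. It is not the script from the Claude session, just a minimal TypeScript illustration: `InvoiceRow`, `selectRows`, `downloadRows`, the 500 ms pause, and the selector in the comment are all made-up assumptions, since the app's page structure isn't shown here. The point is the `limit` parameter: run with a limit of 1, check that a single file lands in your downloads folder, and only then let it loose on every row.

```typescript
// Hypothetical shape of one invoice row. In a real page, `download`
// would wrap something like `button.click()`.
interface InvoiceRow {
  id: string;
  download: () => void;
}

// Pick the rows to process. Pass limit = 1 for the first trial run;
// omit it to process everything.
function selectRows(rows: InvoiceRow[], limit: number = rows.length): InvoiceRow[] {
  return rows.slice(0, Math.max(0, limit));
}

// Small pause helper so clicks are not fired all at once.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Trigger the downloads one by one and return the ids that were processed.
async function downloadRows(
  rows: InvoiceRow[],
  limit?: number,
  delayMs: number = 500
): Promise<string[]> {
  const done: string[] = [];
  for (const row of selectRows(rows, limit)) {
    row.download();
    done.push(row.id);
    await sleep(delayMs);
  }
  return done;
}

// In a real page, the rows might be collected like this
// (the selector is a placeholder, not the app's actual markup):
//
// const rows = Array.from(document.querySelectorAll<HTMLElement>("a.download"))
//   .map((el, i) => ({ id: String(i), download: () => el.click() }));
// downloadRows(rows, 1);   // trial run: one row only
// downloadRows(rows);      // then the whole page
```

To paste something like this into the browser console, the type annotations would have to be stripped or compiled away first.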

We also both got a bit too excited as the vibe continued in a positive direction.

Here's the whole vibe trip. You can skim through my 12 short prompts on the right to see the story unfold:

[Shared Claude conversation: "Browser button click automation"]

Turned out that the virtual assistant was Claude after all.

So a laborious task turned into a fun engineering challenge. I clocked the total prompting and thinking time at 25 minutes, which is nothing compared to hours of rote doom-clicking through 1,000 links or interviewing people. Obviously, a virtual assistant is still great for other tasks, but in the end, a good assistant would have vibed out a script like this as well.

The lesson here is that there is nothing like solving a task by guiding the LLM in small, gradually increasing steps and having a Scotsman on your side! (Thank you, bud!)

💡
P.S.: In my engineering practice, I'm turning away from prompting and towards agentic workflows. I try not to prompt or micro-manage the agent too much for the initial task. Instead, I write a PLAN.md with success criteria and let the agent do its thing until it's done. On the recommendation of an American man, I recently started using the Playwright MCP server for browser-based success criteria. In hindsight, I would have used that with Claude Code here and let it figure things out itself. But that is a topic for the next post.