# Closing the loop
Efficient agent workflows depend on a closed feedback loop. Agents should be able to gather signals from tests, logs, and runtime checks continuously, without waiting for manual input on every step.
The tips below focus on building that loop so agents can diagnose and fix issues more autonomously.
## Tests, tests, tests

- Make sure your agent writes a test for any regression it finds before attempting to fix the code itself. If it doesn't do this on its own, consider saying so in your AGENTS.md.
- Pay attention to how models assert the expected state. Many models tend to write leaky assertions that only catch the exact issue they just reasoned about.
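To illustrate the difference, here is a minimal Python sketch (the `dedupe` function and its data are invented for illustration): a leaky assertion that only checks the symptom the model just fixed, versus a tight one that pins down the full expected state.

```python
# Invented example: deduplicating a list of user IDs while preserving order.
def dedupe(ids):
    return list(dict.fromkeys(ids))  # dict keys keep first-seen order

result = dedupe([3, 1, 3, 2, 1])

# Leaky assertion: only checks the one symptom just reasoned about.
# A regression that reorders or drops other elements would still pass.
assert result.count(3) == 1

# Tight assertion: pins the full expected state, so unrelated
# regressions in the same code path are caught too.
assert result == [3, 1, 2]
```

When reviewing agent-written tests, look for the second kind; the first kind tends to go green again after a superficial fix.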
## Backend logs and database access

- Make your app tee its logs to a `*.log` file. This lets agents observe runtime behavior. Models are also good at adding their own temporary logs while debugging.
- Make it easy for an agent to connect to your database via `psql` or `sqlite3`. You can even use this interface in place of database GUIs.
- Tidewave.
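One way to set up the tee, sketched in Python with the standard `logging` module (the logger name and the `app.log` filename are arbitrary choices for this sketch, not conventions from this guide):

```python
import logging
import sys

# Log to stdout for humans AND to app.log for agents to grep/tail.
logger = logging.getLogger("app")
logger.setLevel(logging.DEBUG)

# Human-facing stream: plain messages on stdout.
logger.addHandler(logging.StreamHandler(sys.stdout))

# Agent-facing file: timestamped lines the agent can inspect after a run.
file_handler = logging.FileHandler("app.log")
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s")
)
logger.addHandler(file_handler)

logger.info("request handled")  # goes to both stdout and app.log
```

The same effect can be had at the shell level with `your-app | tee app.log`; the in-process version just survives being launched in different ways.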
## Leverage CLIs

- Models are trained heavily on Bash. They breathe it and are very productive when they can process data through shell one-liners or quick Python scripts.
- If you build a quick library for some remote API:
  - Try to make it easy for agents to play with this API.
  - In a compiled language without a REPL (Go, Rust, Swift, etc.), consider asking your agent to whip up a quick CLI on top of your code.
  - In JS, Python, Elixir, or Ruby, agents can efficiently use REPLs or one-off scripts.
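As a sketch of what such a one-off script looks like, here is a hypothetical Python example; the `StatusAPI` client and all its data are invented stand-ins for whatever library you actually expose to the agent.

```python
import json

class StatusAPI:
    """Invented stand-in client; a real one would make HTTP calls."""
    def list_services(self):
        return [
            {"id": "svc-1", "state": "running"},
            {"id": "svc-2", "state": "stopped"},
        ]

# One-off script an agent might write: dump anything that isn't
# running, as JSON it can pipe into the next command.
api = StatusAPI()
stopped = [s for s in api.list_services() if s["state"] != "running"]
print(json.dumps(stopped, indent=2))
```

The point is less the script itself than the shape: if your library makes a three-line probe like this possible, agents will write and discard dozens of them while exploring.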
If you’re building a CLI that agents will use, design it for non-interactive use from the start:
- Make it non-interactive. Every input needs a flag equivalent. Don’t drop into interactive prompts mid-execution — agents can’t press arrow keys.
- Make `--help` useful. Include examples in every subcommand's help output. Agents pattern-match off examples faster than they read descriptions.
- Accept stdin. Agents think in pipelines and want to chain commands. Don't require positional args in unusual orders.
- Fail fast with actionable errors. If a required flag is missing, show the correct invocation immediately. Agents self-correct well when you give them something to work with.
- Make commands idempotent. Agents retry often. Running the same command twice should be a no-op, not a duplicate action.
- Add `--dry-run` for destructive actions. Let agents validate a plan before committing to it.
- Add `--yes` to skip confirmations. Make the safe path the default, but allow bypassing it.
- Use a predictable command structure. Pick a pattern (e.g. resource + verb) and use it everywhere. If an agent learns `mycli service list`, it should be able to guess `mycli deploy list`.
- Return data on success. Output IDs and URLs, not just a success message.
Read more:
- Building CLIs for agents Eric Zakariasson. 2026-03-25
Also, take a look at these skills:
- tmux skill Armin Ronacher. 2026-01-23 - useful if you make/use interactive CLIs. Agents are pretty good at using GDB/LLDB via tmux.
## Automating web frontend QA

Current frontier models are surprisingly capable of browsing websites, clicking around, and observing what happens, provided they are given the right tools.
Try using one of these tools:
- Cursor Browser
- agent-browser Vercel. 2026-01-16
- Claude in Chrome
Make it easy for agents to spawn a new instance of the app by themselves. Ideally on a separate port so that multiple agents can work in parallel.
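A small Python sketch of one way to do the separate-port part: ask the OS for a free port before launching the instance. The launch command at the end is a placeholder, not something prescribed by this guide.

```python
import socket

def free_port() -> int:
    """Ask the OS for an unused port so parallel agents don't collide."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))  # port 0 tells the OS to pick a free one
        return s.getsockname()[1]

port = free_port()
# Hypothetical launch step; substitute your app's real dev-server command:
#   subprocess.Popen(["npm", "run", "dev"],
#                    env={**os.environ, "PORT": str(port)})
print(f"spawning app instance on port {port}")
```

There is a small race between closing the probe socket and the app binding the port, but for local QA instances that is usually acceptable.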
Also, take a look at these skills:
- vercel-react-best-practices skill Vercel. 2026-01-16
## Automating mobile app QA

You can also tell the agent to play with a phone simulator.
Take a look at:
- Radon AI
- XcodeBuildMCP Sentry.
- callstackincubator/agent-device: CLI to control iOS and Android devices for AI agents Callstack.
Also, have a look at these skills:
- expo/skills Expo.
- react-native-best-practices skill Callstack.