Closing the loop
Efficient agent workflows depend on a closed feedback loop. Agents should be able to gather signals from tests, logs, and runtime checks continuously, without waiting for manual input on every step.
The tips below focus on building that loop so agents can diagnose and fix issues more autonomously.
Tests, tests, tests
- Make sure your agent writes tests for any regression it finds before attempting to fix actual code. If it doesn’t do this by itself, consider telling it so in your AGENTS.md.
- Pay attention to how models assert the expected state. Many models tend to write leaky assertions: checks that only catch the exact issue the model just reasoned about, while other incorrect states still pass.
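As a rough illustration of a leaky assertion, here is a minimal Python sketch. The `slugify` function and both tests are hypothetical, invented for this example; assume the regression was that trailing whitespace in a title produced a slug ending in a dash.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under test: turn a title into a URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def test_leaky() -> None:
    # Leaky: checks only the symptom from the bug report (a trailing dash),
    # so an implementation returning "" or "hello" would still pass.
    assert not slugify("Hello World ").endswith("-")


def test_full_state() -> None:
    # Tighter: pins the whole expected value, plus neighbouring edge cases.
    assert slugify("Hello World ") == "hello-world"
    assert slugify("  Hello,   World!") == "hello-world"
    assert slugify("") == ""
```

When reviewing agent-written tests, the second style is usually the one to ask for: it fails on any wrong output, not only on the one that was just debugged.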
Backend logs and database access
- Make your app tee logs to a *.log file. This will allow agents to observe runtime behavior. Models are also good at adding their own temporary logs while debugging.
- Make it easy for an agent to connect to your database via psql or sqlite3. You can even use this interface in place of database GUIs.
- Tidewave.
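One possible way to tee logs in a Python app is to attach both a console handler and a file handler to the same logger. This is only a sketch using the standard `logging` module; the logger name `app` and the default path `app.log` are placeholders, not something this guide prescribes.

```python
import logging
import sys


def configure_logging(log_path: str = "app.log") -> logging.Logger:
    """Send every log record to both stderr and a file (a 'tee')."""
    logger = logging.getLogger("app")  # "app" is a placeholder name
    logger.setLevel(logging.DEBUG)

    # Drop handlers from earlier calls so records are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    for handler in (logging.StreamHandler(sys.stderr), logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = configure_logging()
    log.info("server started")
```

With a setup like this, an agent can read or grep the log file while the app is running, without needing access to the terminal that started it.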
Leverage CLIs
- Models are trained a lot on Bash. They breathe it and are very productive when they can process data through shell one-liners or quick Python scripts.
- If you build a quick library for some remote API:
  - Try to make it easy for agents to play with this API.
  - In a non-interactive language (Go, Rust, Swift, etc.), consider asking your agent to whip up a quick CLI on top of your code.
  - In JS, Python, Elixir, or Ruby, agents can efficiently use REPLs or one-off scripts.
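A thin CLI over a library might look roughly like the sketch below. It is written in Python for brevity, but the same shape applies in Go, Rust, or Swift. The `get_user` function, the `apicli` program name, and the `get-user` subcommand are all hypothetical; `get_user` returns canned data here, where a real version would call the remote API.

```python
import argparse
import json
import sys
from typing import Optional, Sequence


def get_user(user_id: int) -> dict:
    """Stand-in for a real library call; returns canned data in this sketch."""
    return {"id": user_id, "name": f"user-{user_id}"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="apicli", description="Thin CLI over the library")
    sub = parser.add_subparsers(dest="command", required=True)
    get = sub.add_parser("get-user", help="Fetch one user and print it as JSON")
    get.add_argument("user_id", type=int)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    if args.command == "get-user":
        # JSON on stdout keeps the output easy to pipe into other tools.
        json.dump(get_user(args.user_id), sys.stdout)
        sys.stdout.write("\n")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Printing one JSON document per invocation is a deliberate choice in this sketch: it lets an agent chain the command with shell one-liners instead of parsing free-form text.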
Also, take a look at these skills:
- tmux skill Armin Ronacher. 2026-01-23 - useful if you make/use interactive CLIs. Agents are pretty good at using GDB/LLDB via tmux.
Automating web frontend QA
Current frontier models are surprisingly capable of browsing websites, clicking around, and observing what happens, provided they are given the right tools.
Try using one of these tools:
- Cursor Browser
- agent-browser Vercel. 2026-01-16
- Claude in Chrome
Make it easy for agents to spawn a new instance of the app by themselves, ideally on a separate port so that multiple agents can work in parallel.
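One way to support parallel instances is a small launcher that picks a free port for each run. The sketch below uses only the Python standard library, and it starts Python's built-in `http.server` as a placeholder for your real app command. The function names are made up for this example.

```python
import socket
import subprocess
import sys
from typing import Tuple


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def start_instance() -> Tuple[subprocess.Popen, int]:
    """Start one app instance on its own port and return (process, port)."""
    port = find_free_port()
    # Placeholder command: swap in whatever starts your app on a given port.
    # Note the small race window: another process could take the port
    # between find_free_port() and the child binding to it.
    cmd = [sys.executable, "-m", "http.server", str(port), "--bind", "127.0.0.1"]
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc, port


if __name__ == "__main__":
    proc, port = start_instance()
    print(f"app running at http://127.0.0.1:{port} (pid {proc.pid})")
    proc.wait()
```

Printing the chosen URL gives the agent something concrete to pass to its browser tool, and each agent running the script gets its own instance.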
Also, take a look at these skills:
- vercel-react-best-practices skill Vercel. 2026-01-16
Automating mobile app QA
You can also tell the agent to play with a phone simulator.
Take a look at:
- Radon AI
- XcodeBuildMCP Sentry.
- callstackincubator/agent-device: CLI to control iOS and Android devices for AI agents Callstack.
Also, have a look at these skills:
- expo/skills Expo.
- react-native-best-practices skill Callstack.