Closing the loop

Efficient agent workflows depend on a closed feedback loop. Agents should be able to gather signals from tests, logs, and runtime checks continuously, without waiting for manual input on every step.

The tips below focus on building that loop so agents can diagnose and fix issues more autonomously.

  1. Make sure your agent writes a test for any regression it finds before attempting to fix the code itself. If it doesn’t do this on its own, consider instructing it to do so in your AGENTS.md.
  2. Pay attention to how models assert the expected state. Many models tend to write leaky assertions, which only catch the exact issue they just reasoned about.
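
To make the point about assertions concrete, here is a minimal sketch that contrasts a narrow assertion with one that pins down the expected state. It assumes "leaky" means an assertion that only rules out the one bad value from the bug report. `Cart` and `total_cents` are hypothetical names invented for this illustration, and the tests are written as plain functions that a runner such as pytest could collect.

```python
from dataclasses import dataclass


@dataclass
class Cart:
    # Hypothetical data type, used only for this illustration.
    subtotal_cents: int
    discount_percent: int


def total_cents(cart: Cart) -> int:
    """Hypothetical function under test: clamps the discount to 0..100."""
    percent = min(max(cart.discount_percent, 0), 100)
    return cart.subtotal_cents * (100 - percent) // 100


def test_leaky() -> None:
    # Leaky: only excludes the single wrong value seen in the bug report.
    # A broken implementation returning -499 or 1000 would still pass.
    assert total_cents(Cart(1000, 150)) != -500


def test_expected_state() -> None:
    # Stronger: states the exact expected result for the regression case
    # and for neighbouring inputs that a careless fix could break.
    assert total_cents(Cart(1000, 150)) == 0
    assert total_cents(Cart(1000, 0)) == 1000
    assert total_cents(Cart(1000, 25)) == 750
```

When reviewing agent-written tests, the second style is the one to look for: it fails for any wrong answer, not only for the one the model just reasoned about.
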
  1. Make your app tee logs to a *.log file. This will allow agents to observe runtime behavior. Models are also good at adding their own temporary logs while debugging.
  2. Make it easy for an agent to connect to your database via psql or sqlite3. You can even use this interface in place of database GUIs.
  3. Tidewave.
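
One way to tee logs to a file, sketched with Python's standard `logging` module. The logger name `app`, the default path `app.log`, and the helper name `configure_logging` are assumptions made for this example; adapt them to whatever your app already uses.

```python
import logging
import sys


def configure_logging(log_path: str = "app.log") -> logging.Logger:
    """Send every log record to both stderr and a file (a 'tee')."""
    logger = logging.getLogger("app")  # assumed logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s: %(message)s"
    )
    handlers = (
        logging.StreamHandler(sys.stderr),                 # for humans
        logging.FileHandler(log_path, encoding="utf-8"),   # for agents
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

An agent can then read or `tail` the file while the app runs, without needing access to the terminal the app was started from.
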
  1. Models are trained a lot on Bash. They breathe it and are very productive when they can process data through shell one-liners or quick Python scripts.
  2. If you build a quick library for some remote API:
    1. Try to make it easy for agents to play with this API.
    2. In a non-interactive language (Go, Rust, Swift, etc.), consider asking your agent to whip up a quick CLI on top of your code.
    3. In JS, Python, Elixir, or Ruby, agents can efficiently use REPLs or one-off scripts.
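
As an example of the kind of quick Python script mentioned in the list above, the sketch below counts log lines per level. It assumes each line contains a standard level name such as `INFO` or `ERROR`; the function names and the log format are illustrative, not taken from any particular app.

```python
import re
import sys
from collections import Counter
from typing import Iterable

# Assumes log lines contain one of the standard level names.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines: Iterable[str]) -> Counter:
    """Count lines by the first log level name found in each line."""
    counts: Counter = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def main(path: str) -> None:
    """Print a per-level summary of the log file at `path`."""
    with open(path, encoding="utf-8") as f:
        for level, n in count_levels(f).most_common():
            print(f"{n:6d} {level}")


# Run as: python summarize_log.py app.log  (the script name is hypothetical)
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Scripts like this are disposable. The point is that an agent can write and run one in seconds to answer a specific question about runtime behavior.
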

Also, take a look at these skills:

  • tmux skill (Armin Ronacher, 2026-01-23): useful if you make or use interactive CLIs. Agents are pretty good at using GDB/LLDB via tmux.

Current frontier models are surprisingly capable of browsing websites, clicking around, and observing what happens, provided they are given the right tools.

Try using one of these tools:

Make it easy for agents to spawn a new instance of the app by themselves. Ideally on a separate port so that multiple agents can work in parallel.
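
A minimal sketch of spawning an instance on its own port. It assumes your app reads its port from a `PORT` environment variable, which is a common convention but not something every app supports. The helper names are hypothetical.

```python
import os
import socket
import subprocess
from typing import List, Tuple


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def spawn_instance(command: List[str]) -> Tuple[subprocess.Popen, int]:
    """Start `command` with PORT set to a free port; return (process, port).

    Assumption: the app being launched reads its port from $PORT.
    """
    port = find_free_port()
    env = {**os.environ, "PORT": str(port)}
    process = subprocess.Popen(command, env=env)
    return process, port
```

There is a small race here: another process could claim the port between `find_free_port` returning and the app binding to it. For local development with a few parallel agents that is usually acceptable. If it is not, have the app bind to port 0 itself and report the port it was given.
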

Also, take a look at these skills:

You can also tell the agent to play with a phone simulator.

Take a look at:

Also, have a look at these skills: