4 Comments
Gal Dayan

What makes this work is hiding in the diagram: every step lands in a sandbox, and the output is a review, not a merge. The agent does real work, but a human still presses the button, and the worst case is a comment you ignore. That is the most forgiving place an acting agent can live. The harder question is what this same loop looks like when there is no merge button in front of it, when the action lands in the world the moment the agent decides. The sandbox is doing a lot of quiet work here.

Paolo Perrone

The no-merge-button case is the scary one. No sandbox means the agent has to be right the first time. Which domain loses it first?

Gal Dayan

The domain where the action and someone feeling it happen in the same instant. A wrong merge sits in a repo until someone reads it, so there is still a gap in which to catch it. A wrong call is already in someone’s ear before you know it went out. A wrong message is already read. Money feels scarier, but it has had decades of clawbacks and holds built around it. The truly exposed domain is live contact with a real person, because there is no sandbox version of “we already spoke to your customer.” It is not the highest-stakes domain that loses first; it is the least reversible one.
