this was a great read. it’s really interesting to watch the push to get agents into production and the struggles that come with it. it’s definitely going to be a year of lessons learned!
What's the biggest failure pattern you're seeing from the security side: agents leaking context, or something else entirely?
the leaks are real. i’m seeing exactly what the paper “agents of chaos” is discussing.
long-context agents drift and are very hard to secure.
agents are based on llms for reasoning, and llms are trained to be helpful. you nailed it: context leaking is real.
the last one is a side effect of faster build cycles. everything is over-provisioned and connected, so integration attacks on things like mcp are huge.
The MCP integration attack surface is the one nobody's talking about yet. Are teams even auditing their server permissions or is it all wide open?
This is the exact loop we removed from governance entirely in AOS.
Every Think → Act → Observe cycle is another LLM call — another chance to hallucinate, another $0.03-$0.15, another 400ms-2.3s. Compound that across 8+ loops and you get the $47K incident you flagged.
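To make the compounding concrete, here is a rough back-of-the-envelope sketch. It uses only the per-call ranges and the 8-loop figure quoted above; the function name `per_task` is made up for illustration, and this is plain arithmetic, not a reconstruction of the $47K incident.

```python
# Per-call ranges quoted in the comment above (low, high).
LOOPS = 8
COST_PER_CALL_USD = (0.03, 0.15)
LATENCY_PER_CALL_S = (0.4, 2.3)


def per_task(loops: int, per_call: tuple[float, float]) -> tuple[float, float]:
    """Scale a (low, high) per-call range by the number of loop iterations."""
    low, high = per_call
    return (loops * low, loops * high)


cost_low, cost_high = per_task(LOOPS, COST_PER_CALL_USD)
lat_low, lat_high = per_task(LOOPS, LATENCY_PER_CALL_S)

print(f"cost per task:    ${cost_low:.2f} to ${cost_high:.2f}")
print(f"latency per task: {lat_low:.1f}s to {lat_high:.1f}s")
```

At 8 loops that works out to roughly $0.24 to $1.20 and 3.2 to 18.4 seconds per task, before any retries.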
Our approach: the agent emits an intent payload. A process-isolated Deterministic Policy Gate evaluates it against compiled rules — no LLM call, no inference, no polling. Sub-100ms, fractions of a penny. The decision is written to a Merkle-tree authenticated ledger.
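For anyone who wants a feel for the pattern, here is a minimal sketch of a deterministic gate plus a Merkle-rooted decision log. It is not the AOS implementation: the rule table, the intent fields (`action`, `resource`), and every function name are assumptions made up for illustration, and process isolation is left out entirely.

```python
import hashlib
import json

# Hypothetical rule table: (action, resource prefix) -> decision.
RULES = {
    ("post.update", "wp:posts/"): "allow",
    ("post.delete", "wp:posts/"): "deny",
}


def evaluate(intent: dict) -> str:
    """Pure table lookup with default deny. No model call is involved."""
    for (action, prefix), decision in RULES.items():
        if intent["action"] == action and intent["resource"].startswith(prefix):
            return decision
    return "deny"


def leaf_hash(record: dict) -> bytes:
    """Hash one decision record; the 0x00 prefix marks a leaf node."""
    data = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(b"\x00" + data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise into one root; an odd node is carried up."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = list(leaves)
    while len(level) > 1:
        nxt = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


class Ledger:
    """Append-only list of decision hashes, summarized by a Merkle root."""

    def __init__(self) -> None:
        self.leaves: list[bytes] = []

    def record(self, intent: dict, decision: str) -> str:
        self.leaves.append(leaf_hash({"intent": intent, "decision": decision}))
        return merkle_root(self.leaves).hex()


def gate(intent: dict, ledger: Ledger) -> tuple[str, str]:
    """Evaluate an intent and log the decision; returns (decision, new root)."""
    decision = evaluate(intent)
    return decision, ledger.record(intent, decision)


ledger = Ledger()
print(gate({"action": "post.update", "resource": "wp:posts/42"}, ledger))
print(gate({"action": "post.delete", "resource": "wp:posts/42"}, ledger))
```

Because `evaluate` is a plain lookup, the same intent always produces the same decision, and replaying the same sequence of intents into a fresh ledger reproduces the same root, which is what makes the log auditable.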
The polling tax disappears when governance is deterministic infrastructure, not another model in the loop.
http://governanceforwp.com if you want to see it working in production on WordPress.
Sub-100ms on the policy gate is impressive. How gnarly are the rule sets getting in production, simple allow/deny or something way more nuanced?