6 Comments
ToxSec's avatar

this was a great read. it’s really interesting to watch the push to get agents into production and the struggles that come with them. it’s definitely going to be a year of lessons learned!

Paolo Perrone's avatar

What's the biggest failure pattern you're seeing from the security side: agents leaking context, or something else entirely?

ToxSec's avatar

the leaks are real. i’m seeing exactly what the paper “agents of chaos” is discussing.

long-context agents drift and are very hard to secure.

agents rely on LLMs for reasoning, and LLMs are trained to be helpful. you nailed it: context leaking is real.

the last one is a side effect of faster build cycles. everything is over-provisioned and connected, so integration attacks on things like MCP are huge.

Paolo Perrone's avatar

The MCP integration attack surface is the one nobody's talking about yet. Are teams even auditing their server permissions or is it all wide open?

Gene Salvatore's avatar

This is the exact loop we removed from governance entirely in AOS.

Every Think → Act → Observe cycle is another LLM call — another chance to hallucinate, another $0.03-$0.15, another 400ms-2.3s. Compound that across 8+ loops and you get the $47K incident you flagged.
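
As a rough back-of-the-envelope check on that compounding, here is a minimal sketch using only the per-call ranges quoted above. The function names, the choice of exactly 8 loops, and the idea of dividing a $47K budget by the per-run cost are illustrative assumptions, not a reconstruction of how the incident actually happened.

```python
import math


def per_run(loops: int, cost_per_call: float, latency_s: float) -> tuple:
    """Cost (USD) and latency (seconds) of one agent run, one LLM call per loop."""
    return loops * cost_per_call, loops * latency_s


def runs_to_reach(budget_usd: float, cost_per_run: float) -> int:
    """Number of whole runs needed before total spend reaches budget_usd."""
    return math.ceil(budget_usd / cost_per_run)


LOOPS = 8  # assumption: the lower bound of "8+ loops"

# Low and high ends of the quoted per-call ranges.
low_cost, low_latency = per_run(LOOPS, 0.03, 0.4)
high_cost, high_latency = per_run(LOOPS, 0.15, 2.3)

print(f"per run: ${low_cost:.2f} to ${high_cost:.2f}, "
      f"{low_latency:.1f}s to {high_latency:.1f}s")
print(f"runs to reach $47,000: {runs_to_reach(47_000, high_cost)} "
      f"to {runs_to_reach(47_000, low_cost)}")
```

Under these assumptions a single run stays cheap; the total only gets large when runs repeat unattended at volume.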

Our approach: the agent emits an intent payload. A process-isolated Deterministic Policy Gate evaluates it against compiled rules — no LLM call, no inference, no polling. Sub-100ms, fractions of a penny. The decision is written to a Merkle-tree authenticated ledger.
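
To make the shape of that idea concrete, here is a minimal sketch of a deterministic gate plus a Merkle-root ledger. It is not the AOS implementation: the rule table, the intent fields (`action`, `resource`), and every function and class name are hypothetical, the gate runs in-process rather than process-isolated, and the Merkle root is recomputed from scratch on each write for simplicity.

```python
import hashlib
import json
from typing import Dict, List, Tuple

# Hypothetical rule table: (action, resource) -> allowed?
RULES: Dict[Tuple[str, str], bool] = {
    ("read", "posts"): True,
    ("delete", "posts"): False,
}


def evaluate(intent: dict) -> str:
    """Plain table lookup, no model call. Unknown intents are denied by default."""
    key = (intent.get("action"), intent.get("resource"))
    return "allow" if RULES.get(key, False) else "deny"


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> str:
    """Hex root over the hashed leaves; an odd node is paired with itself."""
    if not leaves:
        return _sha256(b"").hex()
    level = [_sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()


class Ledger:
    """Append-only list of decisions, summarized by a Merkle root."""

    def __init__(self) -> None:
        self.entries: List[bytes] = []

    def record(self, intent: dict, decision: str) -> str:
        # sort_keys makes the serialized entry deterministic.
        entry = json.dumps({"intent": intent, "decision": decision},
                           sort_keys=True).encode()
        self.entries.append(entry)
        return merkle_root(self.entries)


def gate(intent: dict, ledger: Ledger) -> Tuple[str, str]:
    """Evaluate the intent, log the decision, return (decision, new root)."""
    decision = evaluate(intent)
    return decision, ledger.record(intent, decision)


ledger = Ledger()
print(gate({"action": "read", "resource": "posts"}, ledger)[0])
print(gate({"action": "delete", "resource": "posts"}, ledger)[0])
```

The same intent always produces the same decision, and changing any recorded entry changes the root, which is what makes the log checkable after the fact.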

The polling tax disappears when governance is deterministic infrastructure, not another model in the loop.

http://governanceforwp.com if you want to see it working in production on WordPress.

Paolo Perrone's avatar

Sub-100ms on the policy gate is impressive. How gnarly are the rule sets getting in production: simple allow/deny, or something way more nuanced?