2026-03-15 · 3 min read

Zero Trust in a world of autonomous AI agents

API keys and service accounts weren't designed for agents that spawn sub-agents, hold unbounded sessions, and shift privileges mid-task. Here's what Zero Trust looks like when the principals aren't human.

The identity problem

Traditional authentication assumes a bounded session with a known principal. A human logs in, gets a token, does work, logs out. The token has a clear lifetime and the principal has a stable identity.

AI agents break every one of these assumptions.

An orchestrator agent spawns sub-agents dynamically. Each sub-agent may spawn further agents. Sessions are unbounded — an agent might run for hours or days. Privileges change mid-task as the agent's context evolves. And the "principal" is a chain of delegation that can be arbitrarily deep.

The question isn't whether to apply Zero Trust to AI agents. It's how to make it practical.

Short-lived credentials over long-lived keys

The first principle is simple: never issue a credential that outlives its task.

API keys are the worst offender. They're typically long-lived, broadly scoped, and impossible to trace through delegation chains. When an agent passes an API key to a sub-agent, you've lost all visibility into who is actually using it.

Instead, issue cryptographic certificates with explicit TTLs. A sub-agent spawned for a five-minute scan gets a certificate that expires in ninety seconds, renewable only by the agent that spawned it. When the certificate expires, access is automatically revoked — no cleanup required.

Certificate:
  Subject: spiffe://cluster/agent/scanner-7f2a
  Issuer:  Vörðr Credential Broker
  TTL:     90s
  Lineage: scanner-7f2a → orchestrator-3b1c → analyst:sarah

Intent validation

Identity answers "who are you?" For AI agents, you also need to ask a second question: "what are you trying to do?"

Every agent declares a manifest before execution: what it intends to access, what operations it will perform, and what constraints apply. The policy engine validates each action against this manifest in real time.

When a prompt injection redirects an agent mid-task — say, from reading a payments table to posting data to an external webhook — the manifest validator catches the deviation. The action is blocked before it executes, and human escalation is triggered.
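A minimal sketch of that validator, under the assumption that a manifest is a set of declared (operation, resource) pairs; a real engine would express richer constraints, but the shape of the check is the same:

```python
from dataclasses import dataclass, field

@dataclass
class Manifest:
    """Declared intent: the operations and resources an agent may use."""
    allowed: set  # {(operation, resource), ...}

@dataclass
class Validator:
    manifest: Manifest
    violations: list = field(default_factory=list)

    def check(self, operation: str, resource: str) -> bool:
        """Allow only actions consistent with the declared manifest."""
        if (operation, resource) in self.manifest.allowed:
            return True
        # Deviation from declared intent: block and record for escalation.
        self.violations.append((operation, resource))
        return False
```

In the prompt-injection scenario above, `check("read", "db/payments")` passes while `check("post", "https://attacker.example")` fails and lands in `violations` for human review.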

This is the difference between access control and intent control. Access control asks if you're allowed to touch a resource. Intent control asks if touching that resource is consistent with what you declared you'd do.

Lineage as audit trail

Every credential carries a lineage token — a cryptographic chain tracing delegation back to a human principal. When sub-agent scanner-7f2a accesses the payments database, the database can verify not just the agent's identity, but the full chain: scanner → orchestrator → Sarah.

This makes forensics tractable. If something goes wrong, you can trace exactly which human authorised the chain, which orchestrator spawned the sub-agent, and what the declared intent was at each level.
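One way to make such a chain verifiable is to chain a signature over each delegation link, so no link can be inserted or altered after the fact. The sketch below uses a shared-key HMAC purely for illustration; a production system would use per-link asymmetric signatures (e.g. tied to SPIFFE SVIDs) rather than a shared secret:

```python
import hashlib
import hmac

def sign_link(parent_sig: bytes, principal: str, key: bytes) -> bytes:
    """Each delegation link signs over the previous link's signature."""
    return hmac.new(key, parent_sig + principal.encode(), hashlib.sha256).digest()

def build_chain(principals: list[str], key: bytes) -> bytes:
    """Fold the delegation chain (human first) into a single lineage token."""
    sig = b"root"
    for p in principals:
        sig = sign_link(sig, p, key)
    return sig

def verify_chain(principals: list[str], token: bytes, key: bytes) -> bool:
    """Recompute the chain and compare in constant time."""
    return hmac.compare_digest(build_chain(principals, key), token)
```

Tampering with any principal in the chain — swapping the human at the root, say — invalidates the token, which is what makes the audit trail trustworthy.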

What this means in practice

Zero Trust for AI agents isn't a product — it's an architecture. The components are:

  • SPIFFE-based identity fabric for agent-to-agent authentication
  • Short-lived certificates with automatic expiry
  • Manifest validation checking actions against declared intent
  • OPA policy enforcement for fine-grained access decisions
  • Lineage tracking for end-to-end auditability
  • Human escalation gates for policy violations
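The components above compose into a single decision per action: an expired credential is denied outright, an action outside the declared manifest escalates to a human, and only declared actions under live credentials proceed. A self-contained sketch of that decision order (the function name and string results are illustrative, not a real policy engine's API):

```python
import time

def authorize(expires_at: float, declared: set,
              operation: str, resource: str) -> str:
    """Order of checks mirrors the component list above."""
    if time.time() >= expires_at:                # short-lived certificate expired
        return "deny: credential expired"
    if (operation, resource) not in declared:    # action not in declared manifest
        return "escalate: deviates from declared intent"
    return "allow"                               # within declared, live authority
```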

The goal isn't to prevent agents from working. It's to ensure that when they work, they do exactly what was authorised — nothing more, nothing less.