You Wouldn't Let Criminals Control Your Pacemaker

Most discussions about AI safety still happen at the wrong layer.

People talk about prompt engineering, moderation, jailbreak resistance, alignment, filters, and whether the model is "smart enough" to avoid doing harmful things.

I think that framing is deeply confused.

If an AI agent can actually do things in your system, then the real question is not whether it sounds trustworthy.

The real question is this.

What is it authorized to do?

That is the question that matters.

Because if an AI agent can touch your files, your APIs, your browser sessions, your database, your users, or your business logic, then you are no longer dealing with a text-generation problem.

You are dealing with an authority problem.

And authority problems are not solved with prompts.

They are solved with access control.

The pacemaker test

I sometimes think about this in very simple terms.

You would not let criminals control your pacemaker.

You would not connect a life-critical device to an untrusted actor and then hope they follow instructions.

You would not secure it with a note saying please do not behave maliciously.

You would secure it by making sure they do not have the authority to control it in the first place.

That is exactly how I think about AI agents.

If your architecture allows an untrusted model, a manipulated prompt, or a compromised agent workflow to invoke backend capabilities without hard authorization boundaries, then your system is insecure by design.

Not because the model is evil.

But because your security model is fantasy.

AI does not need to be malicious to be dangerous

This is another place where people get distracted.

The danger is not that the AI has evil intent.

The danger is that it has capability without sufficiently constrained authority.

A model can hallucinate. A model can misunderstand. A model can be prompt-injected. A model can overgeneralize. A model can confidently do the wrong thing.

None of that is surprising.

What matters is whether the system lets those mistakes cross a security boundary.

That is where role-based access control becomes non-negotiable.
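
What that boundary looks like in code can be very small. Here is a minimal sketch, with hypothetical role and permission names (this is illustrative, not a real policy): every grant is explicit, and anything not listed is denied.

```python
# Hypothetical role -> permission mapping. Anything not listed is denied.
ROLE_PERMISSIONS = {
    "viewer": {"records:read"},
    "editor": {"records:read", "records:write"},
    "admin":  {"records:read", "records:write", "records:delete"},
}

def is_allowed(roles, permission):
    """Deny by default: allow only if some role explicitly grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)

# A hallucinated delete hits the same wall as a malicious one:
assert is_allowed(["editor"], "records:write")
assert not is_allowed(["editor"], "records:delete")
```

The point is not the ten lines of Python. The point is that the check does not care why the request was made, only whether it is authorized.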

Authentication is not enough

A lot of systems stop thinking the moment they have identity.

They authenticate a user. They get a token. They confirm the request came from someone real. And then they blow straight past the next question.

What is this principal actually allowed to do?

That is the question RBAC answers.

Not who you are.

What you are allowed to do.

Those two things are not the same.

In fact, a lot of AI agent systems seem to behave as though authenticated means trusted, and trusted means broadly empowered.

That is insane.

If a user is authenticated, that tells me they have an identity.

It does not tell me they should be able to:

  • mutate records
  • create tools
  • inspect protected data
  • trigger workflows
  • generate backend logic
  • access administrator-only operations

Those things require authorization.

And if an AI agent is acting on behalf of that user, then the exact same rule applies to the agent.

Actually, it applies even more.
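
One way to keep those two questions separate in code is to make the permission check its own step, one that runs even for a fully authenticated principal, and runs identically when an agent acts on that principal's behalf. A sketch with invented names:

```python
class Principal:
    def __init__(self, user_id, permissions):
        self.user_id = user_id               # authentication: who you are
        self.permissions = set(permissions)  # authorization: what you may do

class PermissionDenied(Exception):
    pass

def require(permission):
    """Authentication produced a Principal; this check is a separate question."""
    def decorator(fn):
        def wrapper(principal, *args, **kwargs):
            if permission not in principal.permissions:
                raise PermissionDenied(f"{principal.user_id} lacks {permission}")
            return fn(principal, *args, **kwargs)
        return wrapper
    return decorator

@require("records:delete")
def delete_record(principal, record_id):
    return f"deleted {record_id}"
```

An authenticated user with only `records:read` gets a `PermissionDenied`, not a deletion, and so does any agent carrying that user's identity.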

I do not trust AI with authority by default

This is one of the strongest conclusions I have reached while building systems around AI.

I do not trust the model to decide what should be allowed. I do not trust natural language to define permissions. I do not trust prompts to serve as an access-control boundary. And I definitely do not trust an agent with broad backend capabilities simply because the UX feels magical.

The model is not the authority.

The runtime is the authority. The security system is the authority. The role system is the authority.

That is why RBAC matters so much.

Because RBAC gives me an explicit, enforceable answer to the only question that really matters:

Is this operation allowed for this principal in this context?

If the answer is no, then it should not matter how persuasive the prompt was.

This is where most AI agent architecture goes wrong

A lot of modern AI demos are built around the most dangerous possible idea.

Give the model tools. Give it data. Give it some vague policies. Tell it to behave. Hope for the best.

That is not architecture.

That is wishful thinking.

If the model can invoke backend capabilities, then every tool invocation is a security decision.

Every generated action is a permissions question.

Every endpoint call is an authorization event.

And if your system is not built around that fact, then your AI agent is just an overprivileged process wearing a chat UI.
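
Concretely, that means the tool dispatcher itself performs the authorization check, before any tool code runs. A sketch, with hypothetical tool names and permissions:

```python
# Hypothetical mapping from tool name to the permission it requires.
TOOL_REQUIRED_PERMISSION = {
    "read_record": "records:read",
    "delete_record": "records:delete",
}

def run_tool(tool_name, principal_permissions, tool_impls, **kwargs):
    """Every model-generated invocation passes through this gate first."""
    required = TOOL_REQUIRED_PERMISSION.get(tool_name)
    if required is None:
        return {"ok": False, "error": f"unknown tool: {tool_name}"}
    if required not in principal_permissions:
        # The model asked; the runtime answers. Denial, not negotiation.
        return {"ok": False, "error": f"denied: {tool_name} requires {required}"}
    return {"ok": True, "result": tool_impls[tool_name](**kwargs)}
```

Note that an unknown tool name is also a denial. The model does not get to invent capabilities the dispatcher never registered.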

That may be good enough for a demo.

It is not good enough for production.

RBAC is what turns AI from reckless into usable

This is why I keep coming back to RBAC.

Role-based access control is not some boring enterprise afterthought.

It is the thing that makes AI agents survivable.

It gives me a way to separate:

  • what the user wants
  • what the AI suggests
  • what the runtime permits

That separation is everything.

A user can ask for anything. The AI can attempt anything. But the system must only allow what the current identity and role set are actually permitted to perform.

That is how you stop an AI agent from becoming a security incident.

Not by hoping it remains polite.

By making sure it cannot cross authority boundaries.
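
That three-way separation can be made literal in the control flow. In this sketch the model's output is just a list of proposed actions, and only the runtime decides which of them survive (the action names are illustrative):

```python
def filter_plan(proposed_actions, granted_permissions):
    """The user can ask for anything; the model can propose anything;
    only actions the runtime permits survive this filter."""
    permitted = [a for a in proposed_actions if a["requires"] in granted_permissions]
    denied = [a for a in proposed_actions if a["requires"] not in granted_permissions]
    return permitted, denied

plan = [
    {"name": "summarize", "requires": "records:read"},
    {"name": "purge_all", "requires": "records:delete"},
]
permitted, denied = filter_plan(plan, {"records:read"})
assert [a["name"] for a in permitted] == ["summarize"]
assert [a["name"] for a in denied] == ["purge_all"]
```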

The model should inherit constraints, not invent them

If an AI agent acts on behalf of a user, then the agent should inherit the user's constraints.

Not bypass them. Not reinterpret them. Not creatively expand them.

If a user is allowed to read records but not delete them, then the AI agent should not be able to delete them either.

If a user is not an admin, then the AI agent should not somehow become one through prompt wording.

If a role does not permit access to an endpoint, then the AI must not be able to route around that simply because it found a clever sequence of words.

This sounds obvious when stated plainly.

And yet a lot of current systems behave as though the agent itself is a special class of actor with blurred permissions and magical exemptions.

That is a terrible idea.

The AI should be more constrained than the user, not less.
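
One simple way to enforce that last rule is to compute the agent's effective permissions as the intersection of what the user holds and what the agent is allowed at all. By construction, the agent can never exceed the user. A sketch, with hypothetical permission sets:

```python
def effective_agent_permissions(user_permissions, agent_allowlist):
    """The agent inherits the user's constraints and adds its own.
    Intersection means the agent is at most as powerful as the user,
    and usually less."""
    return set(user_permissions) & set(agent_allowlist)

user = {"records:read", "records:write"}
agent_allowlist = {"records:read"}  # this agent is never allowed to write
assert effective_agent_permissions(user, agent_allowlist) == {"records:read"}
```

No prompt wording changes the result of a set intersection.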

Security starts where the prompt stops

This is really the heart of my position.

Prompts can express intent. Prompts can improve UX. Prompts can influence behavior.

But prompts are not enforcement.

Enforcement starts when the system evaluates whether the requested operation is authorized.

That means:

  • authenticated principal
  • explicit roles
  • verified token
  • enforced endpoint requirements
  • capability restriction
  • runtime denial when access is not allowed

That is what real security looks like.

Everything before that is just conversation.
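
Strung together, those pieces look something like this. Everything here is schematic (the token verification is a stub, and the tables are invented), but the order matters: verify the token, resolve explicit roles, check the endpoint's requirement, and deny by default.

```python
class AuthError(Exception):
    pass

# Hypothetical tables: endpoint requirements and role grants.
ENDPOINT_REQUIREMENTS = {"DELETE /records": "records:delete"}
ROLE_GRANTS = {"admin": {"records:delete"}, "viewer": set()}

def verify_token(token):
    """Stub: a real system verifies a signature and expiry here."""
    if token not in ("viewer-token", "admin-token"):
        raise AuthError("invalid token")
    return {"roles": ["admin"] if token == "admin-token" else ["viewer"]}

def authorize(token, endpoint):
    claims = verify_token(token)                                       # verified token
    granted = set().union(*(ROLE_GRANTS[r] for r in claims["roles"]))  # explicit roles
    required = ENDPOINT_REQUIREMENTS.get(endpoint)                     # endpoint requirement
    if required is None or required not in granted:                    # deny by default
        raise AuthError(f"denied: {endpoint}")
    return True
```

An unknown endpoint is denied, not allowed. That asymmetry is the whole design.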

If your AI agent can do damage, its permissions matter more than its intelligence

This is another thing I think people need to hear more often.

I do not actually care that much whether the model is brilliant.

I care whether it is constrained.

An extremely intelligent agent with weak authorization boundaries is dangerous.

A mediocre agent with strong authorization boundaries is manageable.

That is the tradeoff that matters in production.

Not benchmark scores. Not demo quality. Not how convincing the generated text sounds.

What matters is whether the system can prevent unauthorized action.

My rule of thumb

If I would not let an untrusted stranger do it manually, then I should not let an AI agent do it automatically.

That applies to:

  • production databases
  • payment systems
  • customer records
  • internal documents
  • browser automation
  • backend workflows
  • code generation
  • administrative actions

And the fix is not complicated conceptually.

Do not let the AI define authority.

Make authority explicit. Make it role-based. Make it enforceable. Make denial the default. Make elevation deliberate.

That is what RBAC is for.
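
"Make denial the default, make elevation deliberate" can be a few lines of code rather than a policy document. A sketch with invented names, where elevation is a separate, explicit, auditable act:

```python
AUDIT_LOG = []

def check(granted, permission):
    """Default deny: no entry means no."""
    return permission in granted

def elevate(granted, permission, approved_by):
    """Elevation is a deliberate act by a named approver, and it leaves a record."""
    if approved_by is None:
        raise PermissionError("elevation requires an explicit approver")
    AUDIT_LOG.append({"grant": permission, "approved_by": approved_by})
    return granted | {permission}
```

The agent never calls `elevate`. A human, or a policy engine acting on recorded human intent, does.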

My position

I do not think AI agents become safe because the model is aligned.

I think they become safer when the system is architected so that identity, roles, and authorization boundaries remain in charge no matter what the model says.

That is why I care so much about RBAC.

Because once AI moves from answering questions to executing operations, security stops being about language and starts being about authority.

And authority must be explicit.

You would not let criminals control your pacemaker.

So stop letting probabilistic software operate your backend without real authorization boundaries.