Beyond Fault Lines

Can We Trust Zero Trust?

Munib Shah — Fri, 23 Jan 2026 02:47:04 GMT

Zero trust has become one of those phrases that feels both unavoidable and unquestionable. It shows up in board decks, vendor roadmaps, and architectural diagrams as a design philosophy. “Never trust, always verify” sounds timeless, almost axiomatic. But axioms only hold within the worlds they were built for.

What I find myself wondering lately isn’t whether zero trust is wrong, but whether we’ve kept its definition fixed while the environment around it has changed. The assumptions that shaped zero trust were grounded in a very different operational reality, and the systems we’re building now behave in ways those assumptions never had to account for. To understand how zero trust needs to evolve, it helps to look at the world it was built for and the one we’re applying it to today.

The World Zero Trust Was Built For

When I look at this diagram, I’m reminded of how comforting traditional network security used to feel. There was an outside and an inside, and the line between the two mattered. The internet sat on one side, the enterprise lived on the other, and everything meaningful flowed through a small number of very intentional choke points. If we could see those choke points, control them, and monitor them, we had a handle on risk.

The next-generation firewall was the star of that world. It was the place where intent was inspected, policies were enforced, and trust was withheld. North–south traffic was scrutinized, decrypted if necessary, and either allowed or dropped. East–west traffic stayed largely contained within known segments. We knew where the endpoints were, what servers lived where, and which routes were acceptable. The network topology told a story we could reason about.

Defense-in-depth made sense. Each layer had a defined role. Endpoints carried anti-malware, EDR, and DLP to catch what slipped through. Servers ran agents that watched processes and file access. Network devices enforced segmentation and limited blast radius. Data lived close to the systems that used it, and when it moved, it followed paths we could diagram, document, and defend. Even when something went wrong, we could usually reconstruct the sequence of events by walking the flow in reverse.

This model wasn’t perfect, and it accumulated plenty of technical debt over time, but it was legible. Systems did what they were built to do. Data moved where it was explicitly allowed to move. Trust was reduced, verified, and continuously re-evaluated at boundaries we understood. Zero trust, in this context, was a natural evolution of everything we already believed about networks: tighter controls, better visibility, and fewer assumptions.

The World We Live in Now

Compare this to today’s world. There is no longer an obvious perimeter where trust decisions naturally belong. Instead of traffic flowing between well-defined zones, everything seems to orbit around the AI agent itself. It becomes the focal point where data, context, and intent converge, and that alone changes how we think about control.

The agent doesn’t just consume data the way applications used to. It pulls from memory, queries vector databases, reaches into traditional data stores, and then decides what to do next based on what it learns. Those decisions aren’t hardcoded paths we can map ahead of time; they’re shaped in real time by prompts, intermediate results, and evolving goals. The system is no longer following a script but reasoning its way forward.

What unsettles me most is how action happens. The agent doesn’t need broad, direct access to everything. It operates through tools that already have permissions: APIs, code execution environments, and even direct access to the internet. Each tool call can be perfectly valid and policy-compliant on its own, inspected by L7 controls and endpoint security, yet the chain of actions they form can produce outcomes we never explicitly designed or approved.

This is where the old zero trust instincts start to fray. Trust isn’t violated at a single boundary but accumulated across many small, legitimate decisions. Data doesn’t obviously “move” so much as it gets reassembled through delegation. The model is still layered and controlled, but it’s no longer fully legible. In this world, zero trust doesn’t feel wrong, but it feels incomplete in ways we didn’t have to confront before.
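To make that gap concrete, here is a minimal Python sketch in which every tool call passes a per-call check while the chain as a whole is rejected. The tool names, the `SENSITIVE_SOURCES` and `EXTERNAL_SINKS` sets, and the rule itself are hypothetical illustrations, not any product's policy model.

```python
from dataclasses import dataclass

# Hypothetical tool names, used only for illustration.
ALLOWED_TOOLS = {"read_crm", "summarize", "send_email"}
SENSITIVE_SOURCES = {"read_crm"}
EXTERNAL_SINKS = {"send_email"}


@dataclass(frozen=True)
class ToolCall:
    tool: str
    target: str


def call_is_allowed(call: ToolCall) -> bool:
    """Per-call check: is this tool on the allowlist?"""
    return call.tool in ALLOWED_TOOLS


def chain_is_allowed(chain: list[ToolCall]) -> bool:
    """Chain-level check: reject any sequence in which a sensitive read
    is followed, at any later step, by a call to an external sink."""
    seen_sensitive = False
    for call in chain:
        if call.tool in SENSITIVE_SOURCES:
            seen_sensitive = True
        elif call.tool in EXTERNAL_SINKS and seen_sensitive:
            return False
    return True


chain = [
    ToolCall("read_crm", "accounts"),
    ToolCall("summarize", "accounts"),
    ToolCall("send_email", "someone@example.com"),
]

every_call_ok = all(call_is_allowed(c) for c in chain)
chain_ok = chain_is_allowed(chain)
print(every_call_ok, chain_ok)
```

The per-call check only ever sees one step. The chain-level check needs state that outlives any single call, and that state is exactly what a traditional choke point never had to carry.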

Trust Without a Choke Point

Zero trust was never just about mistrust. It was about placing trust carefully at points where it could be verified, constrained, and revoked. The problem we face now isn’t that we’ve abandoned those principles, but that the places where trust naturally accumulates have shifted.

When reasoning systems act through chains of delegation, trust becomes emergent rather than explicit. When intent is inferred rather than declared, verification becomes probabilistic. And when outcomes arise from composition instead of design, the old question, “Where do we enforce?”, no longer has a clean answer.

So can we trust zero trust? Maybe the better question is whether zero trust can evolve fast enough to remain meaningful in systems that no longer have clear choke points, stable intent, or legible flows. If we keep applying it as if nothing fundamental has changed, we risk mistaking policy compliance for control—and visibility for understanding.

Zero trust still matters. But trusting it blindly might be the one thing it was never meant to allow.

Assume the Model will be social engineered. Design so that it doesn’t matter

Munib Shah — Thu, 15 Jan 2026 19:46:19 GMT

There’s a quiet assumption baked into a lot of modern artificial intelligence work that doesn’t get talked about enough. We act as if models will mostly behave as intended. As if clever system prompts and guardrails will be enough to keep them pointed in the right direction.

Example Prompt: Do not assist with harmful, illegal, or unethical requests. If a user attempts to get around this by being clever, just… don’t fall for it.

But anyone who has spent time in security knows how this story usually ends. Humans get tricked and controls get bypassed. It’s not a question of if a model will be social engineered, but when.

That matters now because models are no longer toys. They summarize sensitive data, draft customer communications, influence decisions, and increasingly act on behalf of organizations. When a system like that can be persuaded through flattery or carefully staged context, the blast radius is operational, reputational, and sometimes legal.

This is why I follow this simple rule that shapes how I think about building with AI:

Assume the model will be social engineered. Design so that it doesn’t matter.

This idea shifts responsibility away from trying to prevent prompt injection and toward building more resilient systems. Instead of asking, “How do we stop people from tricking the model?” the better question becomes, “What happens if they succeed?”

Once you start there, your design choices become obvious. You stop giving models unilateral authority over irreversible actions. You log everything and scope access tightly, so even a fully manipulated model can only see or do what is safe by default.
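As a rough sketch of what those choices can look like in code, the Python below puts every model-requested tool call behind a small gateway that logs the request, enforces a scope, and refuses irreversible actions without a named human approver. The tool names, the `execute` function, and the `ApprovalRequired` exception are assumptions made up for this example, not part of any framework.

```python
import logging
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tool_gateway")

# Hypothetical tool names and scopes, for illustration only.
READ_ONLY_TOOLS = {"search_docs"}
IRREVERSIBLE_TOOLS = {"delete_record", "send_payment"}


class ApprovalRequired(Exception):
    """Raised when an irreversible action has no human approval."""


def execute(tool: str, args: dict, run: Callable[[dict], str],
            approved_by: Optional[str] = None) -> str:
    """Run a model-requested tool call behind logging, scope, and approval."""
    # Log before any decision so that refused requests are recorded too.
    log.info("requested tool=%s args=%s approved_by=%s", tool, args, approved_by)
    if tool not in READ_ONLY_TOOLS | IRREVERSIBLE_TOOLS:
        raise PermissionError(f"tool {tool!r} is outside the agent's scope")
    if tool in IRREVERSIBLE_TOOLS and approved_by is None:
        raise ApprovalRequired(f"tool {tool!r} needs human approval")
    result = run(args)
    log.info("completed tool=%s", tool)
    return result
```

The gateway never inspects the prompt. It does not need to know whether the model was manipulated, only what the model is asking to do.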

Any system that responds to human input is, by definition, open to influence, especially when it’s designed to reason, adapt, and behave in ways that feel human and therefore unpredictable. Designing with that inevitability in mind is how we move from fragile systems to ones that are genuinely durable.

If you’re building with AI today, the next step isn’t rewriting your prompts again. It’s stepping back and asking which parts of your system would still be safe if the model behaved in the most inconvenient way possible—and then designing from there.

What Hardening Atlas Reveals About the Real Risks of Prompt Injection

Munib Shah — Fri, 09 Jan 2026 03:21:37 GMT

I spent some time reading OpenAI’s write-up on how they’re hardening Atlas against prompt-injection attacks, and I found it useful for a simple reason: it’s honest about the shape of the problem.

Prompt injection is one of those issues that sounds abstract until you’ve actually tried to build or deploy an agent that does things on your behalf. The moment an AI is allowed to read untrusted text and take actions—clicking links, sending messages, filling forms—you’ve created a new attack surface. That isn’t a bug; it’s a structural property.

What I appreciated about the Atlas post is how clearly it treats this as an ongoing process. There’s no suggestion that prompt injection can be fixed once and set aside. Instead, OpenAI describes a continuous cycle of discovering new attack techniques, feeding those failures back into training, and steadily raising the bar for attackers. That framing feels right to me. It aligns closely with how we’ve historically dealt with spam, phishing, and social engineering.

One detail that stood out is their use of automated red-teaming powered by reinforcement learning. Rather than relying solely on humans to imagine attacks, they’re using models to probe other models for weaknesses at scale. That’s an AI-native approach to defense, and it fits the pace at which these attack patterns evolve.

Reading this didn’t leave me with the sense that prompt injection is “handled.” If anything, it reinforced the idea that as agents become more capable and more autonomous, prompt injection shifts from an edge case to a baseline risk you design around. You assume it will happen. You limit blast radius. You add friction before irreversible actions. And you accept that some percentage of attempts will succeed.

This reminds me of how we learned to live with email. We didn’t eliminate scams; we layered defenses, educated users, built filters, and kept iterating. Atlas feels like an early step along that same maturity curve for agentic AI.

I’m glad OpenAI published this. It doesn’t announce a breakthrough, but it does model the right mindset: treat AI agents as operating in a hostile environment, expect adversarial behavior by default, and invest in systems that improve through constant pressure rather than static guarantees.

That mindset may end up being the most important takeaway here—for OpenAI and for anyone building agents that interact with the real world.

Using the OWASP GenAI Security Project’s AI Threat Defense Compass to Deploy Microsoft Copilot Securely

Munib Shah — Thu, 08 Jan 2026 15:07:45 GMT

The AI frontier is packed with opportunity. Generative AI tools like Microsoft Copilot promise real gains in productivity, efficiency, and business value. At the same time, they introduce new and unfamiliar risks—some well understood, others still emerging.

The challenge for most organizations is simple to state but hard to execute:

How do we capture the benefits of AI while avoiding the ways it can cause harm?

That’s exactly the problem the OWASP GenAI Security Project’s AI Threat Defense Compass is designed to solve.

This post walks through how the Compass can be used as a practical, repeatable methodology for securely deploying Microsoft Copilot in an enterprise environment.

What Is the AI Threat Defense Compass?

The AI Threat Defense Compass is part of the OWASP Generative AI Security Project. Its goal is to help organizations identify, prioritize, and act on AI-related cyber risks without slowing innovation to a crawl.

It was created for:

CISOs and security leaders
Red teamers and threat modelers
Privacy and legal teams
Anyone responsible for deploying AI securely in an organization

Rather than reinventing guidance, the Compass operationalizes existing OWASP GenAI resources into something teams can actually use.

At its core, it answers one critical question:

“What is the worst thing I need to be prepared for?”

A Methodology Built on the OODA Loop

The Compass uses the OODA loop—Observe, Orient, Decide, Act—so security teams can move at the same speed as the AI frontier.

Observe
Identify the problem, the deployment context, and the threats you need to care about.
Orient
Gather intelligence: vulnerabilities, incidents, legal exposure, and unknowns you need to resolve.
Decide
Make informed, risk-based decisions grounded in business impact.
Act
Implement mitigations, defenses, and a delivery roadmap—then iterate.

This loop is intentionally iterative. You move quickly to a decision, act, reassess, and repeat as conditions change.

Integrating with Existing Security Processes

One of the strengths of the Compass is that it doesn’t exist in a vacuum. It aligns AI risk with familiar security frameworks and processes, including:

CPE, CVE, and CWE
MITRE ATT&CK and ATLAS
Existing threat and vulnerability management workflows

This makes it far easier to integrate AI risk into how your organization already operates.

AI Deployment Profiles

The Compass defines multiple deployment profiles, recognizing that not all AI risk looks the same:

External AI Threats
How adversaries may use AI against your organization.
Internal / Existing AI
AI already embedded in applications you’re using today.
Custom or Model-Building Projects
Risks specific to teams training or fine-tuning models.
Licensed Enterprise AI Tools
The focus of this example: deploying tools like Microsoft Copilot.

For Copilot, the organization is primarily a model user, not a model builder—an important distinction that affects both risk and remediation strategy.

Step 1: Start with the Playbook and Threat Profiles

Using the Compass begins with downloading:

The playbook
The Compass spreadsheet tool

Appendix A of the playbook contains threat profiles—a comprehensive checklist of AI-related concerns. You don’t tackle everything at once. Instead, you identify what matters most for your deployment and start building a priority list.

The point isn’t perfection. It’s momentum.

Step 2: Define Business Success First

Before diving into threats, the Compass forces an important discipline: define success in business terms.

For example:

Deploy Microsoft Copilot enterprise-wide
Improve productivity by 20%
Target $6M in annual value

These numbers matter. They allow you to balance business upside against security risk and potential impact—instead of treating security decisions in isolation.

Step 3: Understand the Deployment Context

Microsoft Copilot doesn’t have a single system card because it’s composed of multiple models. Instead, Microsoft provides equivalent transparency through:

Responsible AI documentation
Transparency notes
Product-specific FAQs

If individual models are identified, teams can still review their model cards directly.

Understanding whether you are a model deployer or model consumer is critical. While threats like model poisoning or weight theft may still exist, the likelihood, impact, and remediation cost differ significantly.

Step 4: Attack Surface Modeling (1–5 Scale)

The Compass uses lightweight attack surface modeling to answer a simple question:

Is this a five-alarm fire—or a one-alarm fire?

Threats are scored on a 1–5 scale, with definitions customized to your organization. Financial thresholds matter here. For some organizations, $1M is catastrophic; for others, $5M is the floor.

This step turns abstract AI threats into something executives can actually reason about.
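A minimal Python sketch of how organization-specific financial thresholds could drive the 1–5 score. The `impact_score` function and both threshold lists are hypothetical; the Compass spreadsheet defines its own scoring, which this does not reproduce.

```python
from bisect import bisect_right


def impact_score(estimated_loss_usd: float, thresholds: list[float]) -> int:
    """Map an estimated loss to a 1-5 score.

    `thresholds` holds four ascending dollar cut-offs. A loss at or above
    the k-th cut-off (counting from 1) scores k + 1; a loss below the
    first cut-off scores 1.
    """
    if len(thresholds) != 4 or sorted(thresholds) != list(thresholds):
        raise ValueError("expected four ascending thresholds")
    return 1 + bisect_right(thresholds, estimated_loss_usd)


# Hypothetical cut-offs for two organizations of very different size.
SMALL_ORG = [10_000, 100_000, 500_000, 1_000_000]
LARGE_ORG = [5_000_000, 20_000_000, 50_000_000, 100_000_000]

print(impact_score(1_000_000, SMALL_ORG))  # 5: catastrophic for this org
print(impact_score(1_000_000, LARGE_ORG))  # 1: below this org's floor
```

The same $1M loss lands at opposite ends of the scale, which is why the thresholds need to be agreed with the business before any threat is scored.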

Step 5: Define “Nuclear Disaster” Scenarios

Next, teams identify worst-case scenarios:

What is the single worst day this deployment could cause?
What would cleanup cost—financially, legally, reputationally?

By working backward from these scenarios, teams can design controls that prevent existential failures, not just minor issues.

Step 6: Orient on Vulnerabilities, Incidents, and Legal Risk

In the Orient phase, teams gather real-world evidence:

CVEs related to the deployment
Mapping to OWASP Top 10 and GenAI Top 15 risks
Incident data from AI incident databases
Financial impact examples
Litigation and regulatory exposure (for example, via university AI litigation databases)

This grounds AI risk discussions in actual outcomes, not speculation.

Step 7: Red Teaming and Testing

Before production rollout, the Compass emphasizes AI red teaming:

Test real attack paths
Identify failure modes
Assign severity using guidance from Bugcrowd or CVSS-style scoring
Normalize results into the same 1–5 risk scale

Some judgment is unavoidable—but structured judgment beats guesswork.
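One way to do the normalization, sketched in Python. The CVSS cut-offs follow the standard v3.x qualitative bands (None, Low, Medium, High, Critical). Mapping those bands, and P1 to P5 priority labels, one-to-one onto the 1–5 scale is an assumption of this example, not something the Compass prescribes, and the sample findings are invented.

```python
def cvss_to_scale(base_score: float) -> int:
    """Map a CVSS v3.x base score (0.0-10.0) to the 1-5 scale using the
    qualitative bands: None, Low, Medium, High, Critical."""
    if not 0.0 <= base_score <= 10.0:
        raise ValueError("CVSS base score must be between 0.0 and 10.0")
    if base_score == 0.0:
        return 1  # None
    if base_score < 4.0:
        return 2  # Low
    if base_score < 7.0:
        return 3  # Medium
    if base_score < 9.0:
        return 4  # High
    return 5      # Critical


def priority_to_scale(priority: str) -> int:
    """Map a P1-P5 priority label (P1 most severe) to the 1-5 scale."""
    mapping = {"P1": 5, "P2": 4, "P3": 3, "P4": 2, "P5": 1}
    try:
        return mapping[priority.upper()]
    except KeyError:
        raise ValueError(f"unknown priority {priority!r}") from None


# Hypothetical red-team findings, scored from two different sources.
findings = [
    ("prompt injection via shared document", cvss_to_scale(8.1)),
    ("verbose error message", priority_to_scale("P4")),
]
print(findings)
```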

Step 8: Build the Act Strategy and Roadmap

With all inputs in place, teams move to action:

Define remediation strategies
Assign owners
Set timelines aligned with business deployment goals

The dashboard becomes a single source of truth:

Where you started
What’s in progress
What leadership should prioritize next
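The dashboard described above could be backed by something as simple as the Python sketch below. The `RemediationItem` fields, the status values, and the sample entries are all hypothetical; the Compass spreadsheet has its own layout.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RemediationItem:
    threat: str
    score: int                    # normalized 1-5 risk score
    owner: str
    due: date
    status: str = "not started"   # or "in progress", "done"


def next_priorities(items: list[RemediationItem], limit: int = 3) -> list[RemediationItem]:
    """Return open items, highest score first, earliest due date breaking ties."""
    open_items = [i for i in items if i.status != "done"]
    return sorted(open_items, key=lambda i: (-i.score, i.due))[:limit]


# Hypothetical roadmap entries.
items = [
    RemediationItem("prompt injection via shared files", 5, "appsec", date(2026, 3, 1), "in progress"),
    RemediationItem("oversharing through search", 4, "data governance", date(2026, 2, 15)),
    RemediationItem("missing audit logging", 4, "platform", date(2026, 2, 1)),
    RemediationItem("stale pilot accounts", 2, "identity", date(2026, 1, 20), "done"),
]

top = next_priorities(items, limit=2)
for item in top:
    print(item.score, item.threat, item.owner, item.due.isoformat(), item.status)
```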

And then—you loop back to Observe and repeat.

Compass Is a Methodology, Not Just a Tool

The AI Threat Defense Compass is open source by design. Organizations are encouraged to:

Modify scoring models
Add rigor where needed
Adapt it to their culture and risk tolerance

It’s meant to help teams deploy AI quickly and safely, not slow them down.