Contextual Feeling
(or…The Third Axis: How AI Changes the Way We Think About Threat Detection)
(Before we get going, I’ll open with a Wikipedia-style disambiguation: this article is about the Z axis, as in the spherical coordinate system, not the other Axis (disambiguation) your history book might discuss.)
For years I’ve thought about threat detections on an XY axis. Speed on one side, accuracy on the other. The tradeoff between them is basically a 45-degree line — the faster a detection is to implement, the less accurate it tends to be, and vice versa. The goal has always been to find that sweet spot somewhere in the middle: not so fast that you’re drowning in noise, not so slow that you’re never confident in what you’re seeing.
That model held up for a long time. But I think AI has broken it — not by eliminating the tradeoff, but by adding a third dimension that the old model doesn’t account for.
The Original Two: Atomic and Machine Learning
On one end of the spectrum you have atomic detections. Rules based on a specific, known data point — an IP address, a file hash, a registry key, a domain. Sigma rules live here. Emerging threat feeds live here. The value of atomic detections is that they’re deterministic: the same input produces the same output every time. If you write a rule that fires when it sees a specific IOC, and that IOC shows up, the rule fires. No ambiguity.
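A minimal sketch of what "deterministic" means here — a toy atomic detection over a hypothetical IOC list and event shape (the field names and values are illustrative, not from any real feed):

```python
# Toy atomic detection: exact IOC matching. Same event in, same verdict out.
KNOWN_BAD_HASHES = {"44d88612fea8a8f36de82e1278abb02f"}  # illustrative hash value
KNOWN_BAD_DOMAINS = {"evil.example.com"}

def atomic_detect(event: dict) -> bool:
    """Fires if and only if the event contains a known IOC.
    Fully deterministic: no scores, no confidence, no ambiguity."""
    return (
        event.get("file_hash") in KNOWN_BAD_HASHES
        or event.get("domain") in KNOWN_BAD_DOMAINS
    )
```

The entire behavior lives in the IOC sets, which is exactly why these rules are fast to write and expensive to keep current.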
The tradeoff is tuning. Writing an atomic detection is fast. Getting it accurate — low false positive rate, properly scoped, not firing on legitimate traffic — takes real time. And the threat landscape changes fast enough that rules require constant maintenance to stay relevant. Atomic detections are the bare minimum of any detection program; they’re where you start. They’re the foundation. But they have a ceiling.
On the other end you have machine learning. ML flips the tradeoff. It’s slow to implement — you need historical data, a trained model, ongoing monitoring for accuracy, drift detection over time — but once it’s running it can catch things that no rule would ever catch. Behavioral anomalies, subtle pattern shifts, activity that doesn’t match any known IOC but doesn’t look right either. The catch is that ML models are probabilistic. They’ll tell you something looks suspicious, and they’ll tell you how confident they are, but they won’t tell you definitively that it’s bad. Sometimes they’ll tell you they’re not very confident — and honestly, when they say that, they’re usually right to flag the uncertainty.
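The "probabilistic" half of that contrast can be sketched with the simplest possible behavioral detector — a z-score against a learned baseline. The metric and numbers are made up for illustration; real models are far richer, but the shape of the output is the same: a score, not a verdict.

```python
import statistics

def anomaly_score(value: float, baseline: list[float]) -> float:
    """How many standard deviations `value` sits from the baseline mean.
    Higher = more anomalous. Note this is a score, not a yes/no answer."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(value - mean) / stdev

# Hypothetical baseline: bytes-out per hour on a quiet host.
baseline = [100, 110, 95, 105, 102, 98, 107]
print(anomaly_score(103, baseline))  # small: blends into the baseline
print(anomaly_score(400, baseline))  # large: many sigmas out
```

Deciding where the alerting threshold sits on that score is exactly the tuning-versus-confidence judgment call the paragraph above describes.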
So for a long time the detection strategy question was: where on that spectrum do you want to live? Rules or models? Fast and deterministic, or slow and probabilistic? Or my favorite…
AI Adds a Third Axis
Here’s where I think the framing needs to change. When people ask where AI fits on that spectrum, they’re still thinking in two dimensions. And it doesn’t fit cleanly.
AI — specifically LLMs and agents — introduces a third variable: determinism.
Atomic detections are fully deterministic. Machine learning is probabilistic but bounded — the same model on the same data will give you a consistent confidence score. LLMs are non-deterministic by nature. The same prompt, given the same context, can produce different outputs. They can hallucinate. They can go down a reasoning path you didn’t intend. They can be confidently wrong. That’s not a flaw to work around — it’s a fundamental characteristic of the technology that has to be part of how you use it.
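The three points on the determinism axis can be caricatured in a few lines. The `llm_verdict` stub below is a toy stand-in for a model sampling at nonzero temperature — not a real model call — but it captures the property that matters: repeated identical inputs are not guaranteed identical outputs.

```python
import random

def rule_verdict(event: str) -> str:
    # Fully deterministic: identical input always yields an identical verdict.
    return "malicious" if "known_bad_ioc" in event else "benign"

def llm_verdict(event: str) -> str:
    # Toy stand-in for an LLM at nonzero temperature: the same prompt can
    # sample different completions on different calls. (Illustrative only.)
    return random.choice(["benign", "suspicious", "malicious"])

# This invariant holds for the rule...
assert rule_verdict("saw known_bad_ioc") == rule_verdict("saw known_bad_ioc")
# ...and no equivalent invariant exists for llm_verdict across repeated calls.
```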
If speed and accuracy are the X and Y axes, determinism is the Z axis. And when you look at it that way, atomic detections, ML models, and AI each occupy a different point in three-dimensional space rather than sitting on the same line.
That reframing matters because it changes how you think about where each one belongs in your stack.
The Right Job for Each
There’s an old saying: good, fast, or cheap — pick two. Detection engineering strategy has always been a version of that problem. The third axis doesn’t solve it. It just makes the tradeoffs more explicit.
Atomic detections are deterministic, fast to deploy, and expensive to maintain at scale; ML models are probabilistic, slow to build, and better at catching what rules miss. AI is non-deterministic, fast at certain tasks, and genuinely bad at others.
The question isn’t which one wins. It’s which one is right for which job.
And I think the industry is currently overindexing on using AI as a detector and underinvesting in using AI as an operator.
Everyone has an agent that “monitors your network” or “finds threats” or “reasons over your alerts.” And some of that is real. But an agent is a feature, not a strategy. Buying a tool that uses AI to detect things doesn’t answer the harder question of what your detection strategy actually is.
Why the Layered Approach Works
In most security conversations cough RSA cough I think there’s a point that often gets missed: AI doesn’t work well in a vacuum. It works well when it has context.
Anthropic published a post on how their own teams use Claude Code internally. They noted that moving to test-driven development — writing tests before writing code — produced more reliable, testable output from Claude. The insight isn’t just that TDD is good engineering hygiene. It’s that giving the AI a defined structure to work within, with clear constraints and expected outcomes, makes the output meaningfully better.
Using AI for detection engineering works the same way. When you ask an AI agent to “find the bad stuff in our network,” you’re giving it almost no structure to work within. When you ask it to write a Sigma rule for a specific TTP, validate it against known-good traffic, and map it to an ATT&CK technique — you’ve given it a test. You’ve given it constraints. You’ve given it a definition of correct. The output is better because the problem is better defined.
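"You've given it a test" can be made literal. Here's a hypothetical acceptance harness for an AI-drafted detection: the draft must fire on every malicious sample and stay silent on known-good traffic before a human even reviews it. The rule is modeled as a plain callable for simplicity; every name here is illustrative.

```python
from typing import Callable

def accept_drafted_rule(
    rule: Callable[[dict], bool],
    malicious_samples: list[dict],
    known_good_traffic: list[dict],
) -> bool:
    """The 'definition of correct' for an AI-drafted detection:
    catch all the bad, fire on none of the known-good."""
    catches_all = all(rule(event) for event in malicious_samples)
    no_false_positives = not any(rule(event) for event in known_good_traffic)
    return catches_all and no_false_positives

# Hypothetical drafted rule for a suspicious parent/child process pair.
def drafted(event: dict) -> bool:
    return event.get("parent") == "winword.exe" and event.get("child") == "powershell.exe"

assert accept_drafted_rule(
    drafted,
    malicious_samples=[{"parent": "winword.exe", "child": "powershell.exe"}],
    known_good_traffic=[{"parent": "explorer.exe", "child": "powershell.exe"}],
)
```

The harness is doing the same job TDD does for code: the constraints, not the generator, define what "good output" means.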
The atomic detection layer and the ML layer aren’t just detection mechanisms. They’re also the context that makes AI effective when you bring it in. Your existing Sigma rules tell AI what you already know how to detect. Your ML models tell AI what behavioral baselines look like. Together they give AI the constraints it needs to do useful work — whether that’s filling coverage gaps, tuning thresholds, or reasoning over anomalies that don’t match any existing rule.
Google Research released a paper on MLE-STAR, a state-of-the-art machine learning engineering agent that automates a range of ML tasks. The whole premise is that getting better results from ML isn’t about replacing the model — it’s about giving it better inputs, better structure, and better feedback loops. AI can now help do that work. Which means the ML layer you already have can get materially better without starting over.
Where AI Actually Changes Things
The most underrated use of AI in detection right now isn’t finding threats. It’s keeping your detection program running.
Think about what actually breaks down in most detection programs over time. Sigma rules go stale because no one has time to rewrite them when a TTP evolves. ML models drift because the data they were trained on no longer reflects current attacker behavior. The ATT&CK matrix gets updated and the coverage gaps don’t get addressed because mapping detections to TTPs manually is slow and nobody’s priority.
AI is genuinely good at all of those maintenance tasks. Writing a Sigma rule from a threat report is something an LLM can do well and fast. Identifying which existing rules are likely to need updates based on new threat intelligence is something an LLM can reason over. Google has published research on using AI to tune machine learning models — adjusting hyperparameters, identifying drift, improving precision — that points at a future where the model-building loop is itself partially automated.
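Even before an LLM touches anything, the maintenance backlog itself is mechanical to surface. A sketch of the "which rules are likely stale" triage — flag anything whose last update predates a freshness window, then hand that prioritized list to an LLM (or a human) to rework. The rule shape and window are hypothetical.

```python
from datetime import date, timedelta

def stale_rules(rules: list[dict], max_age_days: int = 180, today: date = None) -> list[str]:
    """Flag rules whose last update predates the freshness window.
    Hypothetical rule shape: {'name': str, 'updated': date}."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    return [r["name"] for r in rules if r["updated"] < cutoff]
```

This is the unglamorous layer: the list of what needs attention is deterministic; drafting the updated rules is where the LLM earns its keep.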
The non-determinism that makes AI unreliable as a standalone detector becomes much more manageable when a human is reviewing the output before it goes into production. A Sigma rule that an LLM drafts and a human reviews is better than a Sigma rule nobody had time to write. A tuning recommendation from an AI assistant that an engineer validates is better than a model running on stale parameters because nobody got to it.
The way I’m starting to think about this: AI’s role in detection isn’t to replace the atomic layer or the ML layer. It’s to be the engine that keeps both of them current, calibrated, and aligned to the actual threat landscape.
The Framework in Practice
If you think about a mature detection program through this lens, you end up with something like:
Atomic detections for what you know. Fast, deterministic, high-confidence when tuned. AI writes and maintains them — turning threat reports into rules, flagging stale coverage, mapping new TTPs to existing logic.
Machine learning for what you don’t know yet. Probabilistic, behavioral, better at catching novel activity. AI helps tune it — adjusting models, monitoring drift, identifying where the training data needs to be refreshed.
AI agents for what requires reasoning over context. Correlating signals across sources, generating hypotheses, helping analysts work through ambiguous alerts. Used with appropriate skepticism about confidence and with human review in the loop.
None of these replace the others. They cover different parts of the problem, and AI shows up differently in each one.
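The layering above can be sketched as a single triage flow: deterministic rules get first look, the probabilistic score comes next, and the ambiguous residue is routed to an agent for analyst-reviewed reasoning rather than auto-alerted. Thresholds and names are illustrative only.

```python
def triage(event: dict, atomic_rules: list, score_fn, threshold: float = 3.0) -> str:
    """Layered triage sketch. Atomic layer: deterministic, alert outright.
    ML layer: probabilistic, alert above threshold. AI layer: reason over
    the ambiguous middle, with a human reviewing the output."""
    if any(rule(event) for rule in atomic_rules):
        return "alert: known-bad (atomic)"
    score = score_fn(event)
    if score >= threshold:
        return f"alert: anomalous (score={score:.1f})"
    if score >= threshold / 2:
        return "escalate: send context to AI agent for analyst review"
    return "no action"
```

The point of the structure is that the non-deterministic layer only ever sees events the deterministic layers couldn't settle, and its output is a recommendation, not a verdict.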
My Lukewarm AI Take
I’m optimistic about what AI can do for threat detection programs. I’m much more skeptical about how we’re talking about it right now — as if deploying an AI agent is itself a detection strategy.
The teams that will get the most out of AI aren’t the ones that replace their detection program with it. They’re the ones that use it to do the unglamorous work that detection programs actually need: keeping rules current, keeping models calibrated, keeping coverage aligned to what adversaries are actually doing.
That’s not as exciting a demo as an agent that finds threats. But it’s the thing that actually makes detection programs better over time.
Engineering Chaos is about applying modern data engineering to rethink how security teams build and operate their data infrastructure. As always, views are mine. You can share them but I don’t know why you’d want to.



