This blog is perhaps a little bit like an ad, so if you don’t want to see ads, consider not reading it.
This year at RSA 2026, I’m speaking on three topics: securing AI, using AI for the SOC, and sharing lessons about how Google applies AI and other technologies to D&R.
Here are the three fun things!
First, I’m doing a presentation on governing shadow AI agents. Believe it or not, this presentation was created mostly before OpenClaw became a thing (but it has been updated for it!). So you may be surprised how well the content aged (think wine!). Attend this if you are struggling with shadow AI, specifically shadow agents at work.
It is not the APT! The new threat is the “shadow AI agents” employees already use for work, leaking data and making decisions. Banning them is a losing game. This session will offer a better way: turn this organic behavior into a catalyst for secure progress. Learn to discover, assess, and channel unsanctioned agents into a formal strategy that empowers the team rather than forcing it underground.
The second is probably our most detailed discussion yet of how we use AI for detection and response at Google. You’ve probably read our blogs and listened to our talks (especially this one), but this time we are revealing a lot more interesting details about the machinery and also how we arrived at the state we’re in. I promise you this will be fun! And detailed too.
Presenters will share the playbook for building and scaling AI agents in cybersecurity. Attendees will learn four core lessons: building trust with the team, prioritizing real problems, measuring value, and establishing solid governance foundations for the agentic SOC.
Finally, the third isn’t a presentation but a discussion that will help you understand the real state of AI in security operations / SOC. It won’t be about slides, but about sharing lessons on what works and what doesn’t.
Attendees in this peer-led discussion will share stories from the AI-powered SOC trenches. Explore real adoption journeys from manual processes to autonomous agents. Share practical use cases on analyst retraining, workflow auditing, malware analysis, remediation automation, RAG pipelines and more. Trade notes on what’s working, what’s breaking, trust gaps, AI hallucinations, and career redesign.
All in all, join me for securing AI and Shadow Agents, learning from Google about detection and response, and comparing the state of practice of AI in the SOC.
Let’s go through all five pillars (aka readiness dimensions) and see what we can actually do to make your SOC AI-ready.
#1 SOC Data Foundations
As I said before, this one is my absolute favorite and is at the center of most “AI in SOC” (as you recall, I want AI in my SOC, but I dislike the “AI SOC” concept) successes (if done well) and failures (if not done at all).
Reminder: pillar #1 is “security context and data are available and can be queried by machines (API, Model Context Protocol (MCP), etc) in a scalable and reliable manner.” Put simply, for the AI to work for you, it needs your data. As our friends say here, “Context engineering focuses on what information the AI has available. […] For security operations, this distinction is critical. Get the context wrong, and even the most sophisticated model will arrive at inaccurate conclusions.”
Readiness check: Security context and data are available and can be queried by machines in a scalable and reliable manner. This is very easy to check, yet not easy to achieve for many types of data.
For example, “give AI access to past incidents” is very easy in theory (“ah, just give it old tickets”) yet often very hard in reality (“what tickets?”, “aren’t some too sensitive?”, “wait… this ticket didn’t record what happened afterwards, and that totally changed the outcome”, “well, these tickets are in another system”, etc., etc.).
Steps to get ready:
Conduct an “API or Die” data access audit to inventory critical data sources (telemetry and context) and stress-test their APIs (or other access methods) under load to ensure they can handle frequent queries from an AI agent. This is important enough to be a Part 3 blog after this one…
Establish or refine unified, intentional data pipelines for the data you need. This may be your SIEM, this may be a separate security pipeline tool, this may be magick for all I care … but it needs to exist. I met people who use AI to parse human analyst screen videos to understand how humans access legacy data sources, and this is very cool, but perhaps not what you want in prod.
Revamp case management to force structured data entry (e.g., categorized root causes, tagged MITRE ATT&CK techniques) instead of relying on garbled unstructured text descriptions, which provides clean training data for future AI learning. And, yes, if you have to ask: modern gen AI can understand your garbled stream-of-consciousness ticket description… but what it makes of it, you will never know…
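The “API or Die” audit from the steps above can be sketched as a small load-test harness. Everything here is illustrative: the stand-in query function, request counts, and the 2-second SLO are assumptions you would replace with your real data source API and your own targets.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stress_test(query_fn, n_requests=50, max_workers=10, slo_seconds=2.0):
    """Fire n_requests concurrent queries and report latency/failure stats.

    query_fn wraps whatever your real data source API is (SIEM search,
    CMDB lookup, ticket fetch); this harness only measures it.
    """
    def timed_call(i):
        start = time.monotonic()
        try:
            query_fn(i)
            return (time.monotonic() - start, None)
        except Exception as exc:  # rate limits, timeouts, wrapped 5xx errors
            return (time.monotonic() - start, exc)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(timed_call, range(n_requests)))

    latencies = sorted(r[0] for r in results)
    failures = [r[1] for r in results if r[1] is not None]
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {
        "requests": n_requests,
        "failures": len(failures),
        "p95_seconds": round(p95, 3),
        "meets_slo": p95 <= slo_seconds and not failures,
    }

# Example with a fast, reliable stand-in source (replace with a real API call):
report = stress_test(lambda i: time.sleep(0.01), n_requests=20, max_workers=5)
print(report["meets_slo"])  # True for the stand-in; your mileage will vary
```

The point is not this exact code but the habit: run something like it against every source an agent will query, at agent-like volumes, before the agent does.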
Where you arrive: your AI component, AI-powered tool, or AI agent can get the data it needs nearly every time. The cases where it cannot are visible and obvious immediately.
#2 SOC Process Framework and Maturity
Reminder: pillar #2 is “Common SOC workflows that do NOT rely on human-to-human communication are essential for AI success.” As somebody called it, you need “machine-intelligible processes.”
Readiness check: SOC workflows are defined as machine-intelligible processes that can be queried programmatically, and explicit, structured handoff criteria are established for all Human-in-the-Loop (HITL) processes, clearly delineating what is handled by the agent versus the person. Examples for handoff to human may include high decision uncertainty, lack of context to make a call (see pillar #1), extra-sensitive systems, etc.
Common investigation and response workflows do not rely on ad-hoc, human-to-human communication or “tribal knowledge”; such knowledge is discovered and brought to the surface.
Steps to get ready:
Codify the “Tribal Knowledge” into APIs: Stop burying your detection logic in dusty PDFs or inside the heads of your senior analysts. You must document workflows in a structured, machine-readable format that an AI can actually query. If your context — like CMDB or asset inventory — isn’t accessible via API (BTW MCP is not magic!), your AI is essentially flying blind.
Draw a Hard Line Between Agent and Human: Don’t let the AI “guess” its level of authority. Explicitly delegate the high-volume drudgery (log summarization, initial enrichment, IP correlation) to the agent, while keeping high-stakes “kill switches” (like shutting down production servers) firmly in human hands.
Implement a “Grading” System for Continuous Learning: AI shouldn’t just execute tasks; it needs to go to school. Establish a feedback loop where humans actively “grade” the AI’s triage logic based on historical resolution data. This transforms the system from a static script into a living “recipe” that refines itself over time.
Target Processes for AI-Driven Automation: Stop trying to “AI all the things.” Identify specific investigation workflows that are candidates for automation and use your historical alert triage data as a training ground to ensure the agent actually learns what “good” looks like.
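The “hard line between agent and human” from the steps above can be made machine-readable rather than tribal. A minimal sketch, assuming hypothetical thresholds and asset names; the handoff triggers mirror the ones in the readiness check (uncertainty, missing context, extra-sensitive systems):

```python
from dataclasses import dataclass

# Illustrative values: tune the floor against graded history, and source
# the sensitive-asset set from your CMDB, not from a hardcoded set.
CONFIDENCE_FLOOR = 0.8
SENSITIVE_ASSETS = {"prod-payments", "domain-controller"}

@dataclass
class TriageVerdict:
    decision: str  # "auto_close" or "handoff_to_human"
    reason: str

def route_alert(confidence: float, asset: str, context_complete: bool) -> TriageVerdict:
    """Encode the agent-vs-human handoff line explicitly, not in tribal lore."""
    if not context_complete:
        return TriageVerdict("handoff_to_human", "missing context (see pillar #1)")
    if asset in SENSITIVE_ASSETS:
        return TriageVerdict("handoff_to_human", "extra-sensitive system")
    if confidence < CONFIDENCE_FLOOR:
        return TriageVerdict("handoff_to_human", "decision uncertainty too high")
    return TriageVerdict("auto_close", "within delegated authority")

print(route_alert(0.95, "dev-vm-17", True).decision)      # auto_close
print(route_alert(0.95, "prod-payments", True).decision)  # handoff_to_human
```

The design choice worth stealing is that every handoff carries a reason, so pillar #5 metrics can later tell you which trigger fires most often.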
Where you arrive: The “tribal knowledge” that previously drove your SOC is recorded as machine-readable workflows. Explicit, structured handoff points are established for all Human-in-the-Loop processes, and the system uses human grading to continuously refine its logic and improve its “recipe” over time. This does not mean that everything is rigid; the “Visio diagram or death” SOC should stay in the 1990s. Recorded and explicit beats rigid and unchanging.
#3 SOC Human Element and Skills
Reminder: pillar #3 is “Cultivating a culture of augmentation, redefining analyst roles, providing training for human-AI collaboration, and embracing a leadership mindset that accepts probabilistic outcomes.” You say “fluffy management crap”? Well, I say “ignore this and your SOC is dead.”
Readiness check: Leaders have secured formal CISO sign-off on a quantified “AI Error Budget,” defining an acceptable, measured, probabilistic error rate for autonomously closed alerts (that is definitely not zero, BTW). The team is evolving to actively review, grade, and edit AI-generated logic and detection output.
Steps to get ready:
Implement the “AI Error Budget”: Stop pretending AI will be 100% accurate. You must secure formal CISO sign-off on a quantified “AI Error Budget” — a predefined threshold for acceptable mistakes. If an agent automates 1,000 hours of labor but has a 5% error rate, the leadership needs to acknowledge that trade-off upfront. It’s better to define “allowable failure” now than to explain a hallucination during an incident post-mortem.
Pivot from “Robot Work” to Agent Shepherding: The traditional L1/L2 analyst role is effectively dead; long live the “Agent Supervisor.” Instead of manually sifting through logs — work that is essentially “robot work” anyway — your team must be trained to review, grade, and edit AI-generated logic. They are no longer just consumers of alerts; they are the “Editors-in-Chief” of the SOC’s intelligence.
Rebuild the SOC Org Chart and RACI: Adding AI isn’t a “plug and play” software update; it’s an organizational redesign. You need to redefine roles: Detection Engineers become AI Logic Editors, and analysts become Supervisors. Most importantly, your RACI must clearly answer the uncomfortable question: If the AI misses a breach, is the accountability with the person who trained the model or the person who supervised the output?
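The “AI Error Budget” from the steps above is easy to operationalize once humans are grading a sample of autonomously closed alerts. A minimal sketch, borrowing the 5% figure from the text as an illustration; the function names and data shapes are assumptions:

```python
def error_budget_status(graded_samples, budget_rate=0.05):
    """graded_samples: list of (alert_id, agent_was_correct) from human review.

    Returns the observed error rate and whether autonomous closing stays
    enabled. The 5% default is the illustrative figure from the text;
    your CISO signs off on the real number, in writing.
    """
    total = len(graded_samples)
    errors = sum(1 for _, correct in graded_samples if not correct)
    observed = errors / total if total else 0.0
    return {
        "observed_error_rate": round(observed, 4),
        "budget_rate": budget_rate,
        "autonomy_enabled": observed <= budget_rate,
    }

# 97 correct closures, 3 wrong ones: 3% observed, inside a 5% budget.
samples = [("a1", True)] * 97 + [("a2", False)] * 3
print(error_budget_status(samples)["autonomy_enabled"])  # True
```

The useful property: when the budget is blown, autonomy throttles back by policy, not by a panicked post-incident meeting.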
Where you arrive: well, you arrive at a practical realization that you have “AI in SOC” (and not AI SOC). The tools augment people (and in some cases, do the work end to end too). No pro- (“AI SOC means all humans can go home”) or contra-AI (“it makes mistakes and this means we cannot use it”) crazies nearby.
#4 Modern SOC Technology Stack
Reminder: pillar #4 is “Modern SOC Technology Stack.” If your tools lack APIs, take them and go back to the 1990s from whence you came! Destroy your time machine when you arrive, don’t come back to 2026!
Readiness check: The security stack is modern, fast (“no multi-hour data queries”), and interoperable; it supports new AI capabilities integrating seamlessly, tools can communicate without a human acting as a manual bridge, and the stack can handle agentic AI request volumes.
Steps to get ready:
Mandate “Detection-as-Code” (DaC): This is no longer optional. To make your stack machine-readable, you must implement version control (Git), CI/CD pipelines, and automated testing for all detections. If your detection logic isn’t codified, your AI agent has nothing to interact with except a brittle GUI — and that is a recipe for failure.
Find Your “Interoperability Ceiling” via Stress Testing: Before you go live, simulate reality. Have an agent attempt to enrich 50 alerts simultaneously to see where the pipes burst. Does your SOAR tool hit a rate limit? Does your threat intel provider cut you off? You need to find the breaking point of your tech stack’s interoperability before an actual incident does it for you.
Decouple “Native” from “Custom” Agents: Don’t reinvent the wheel, but don’t expect a vendor’s “native” agent to understand your weird, proprietary legacy systems. Define a clear strategy: use native agents for standard tool-specific tasks, and reserve your engineering resources for custom agents designed to navigate your unique compliance requirements and internal “secret sauce.”
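The Detection-as-Code step above boils down to: rules live in version control, and CI runs regression tests against known-good and known-bad events before anything ships. A toy sketch with made-up field names; real DaC shops typically express rules in a format like Sigma or YARA-L, but the testing discipline is the same:

```python
# A detection rule expressed as code (fields are illustrative assumptions).
def detect_suspicious_service_creation(event: dict) -> bool:
    """Flag service creation by a non-admin account outside a change window."""
    return (
        event.get("action") == "service_created"
        and not event.get("actor_is_admin", False)
        and not event.get("in_change_window", False)
    )

# The CI regression suite: known-bad events must fire, known-good must not.
KNOWN_BAD = [
    {"action": "service_created", "actor_is_admin": False, "in_change_window": False},
]
KNOWN_GOOD = [
    {"action": "service_created", "actor_is_admin": True, "in_change_window": True},
    {"action": "login", "actor_is_admin": False},
]

assert all(detect_suspicious_service_creation(e) for e in KNOWN_BAD)
assert not any(detect_suspicious_service_creation(e) for e in KNOWN_GOOD)
print("detection regression suite passed")
```

Once detections look like this, an AI agent can read, test, and propose edits to them; it cannot do any of that to logic buried in a GUI.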
Where you arrive: this sounds like a perfect quote from Captain Obvious, but you arrive at a SOC powered by tools that work with automation, not with a “human bridge” or “swivel chair.”
#5 SOC Metrics and Feedback Loop
Reminder: pillar #5 is “You are ready for AI if you can, after adding AI, answer the “what got better?” question. You need metrics and a feedback loop to get better.”
Readiness check: Hard baseline metrics (MTTR, MTTD, false positive rates) are established before AI deployment, and the team has a way to quantify the value and improvements resulting from AI. When things get better, you will know it.
Steps to get ready:
Establish the “Before” Baseline and Fix the Data Slop: You cannot claim victory if you don’t know where the goalposts were to begin with. Measure your current MTTR and MTTD rigorously before the first agent is deployed. Simultaneously, force your analysts to stop treating case notes like a private diary. Standardize on structured data entry — categorized root causes and MITRE tags — so the machine has “clean fuel” to learn from rather than a collection of “fixed it” or “closed” comments.
Build an “AI Gym” Using Your “Golden Set”: Do not throw your agents into the deep end of live production traffic on day one. Curate a “Golden Set” of your 50–100 most exemplary past incidents — the ones with flawless notes, clean data, and correct conclusions. This serves as your benchmark; if the AI can’t solve these “solved” problems correctly, it has no business touching your live environment.
Adopt Agent-Specific KPIs for Performance Management: Traditional SOC metrics like “number of alerts closed” are insufficient for an AI-augmented team. You need to track Agent Accuracy Rate, Agent Time Savings, and Agent Uptime as religiously as you track patch latency. If your agent is hallucinating 5% of its summaries, that needs to be a visible red flag on your dashboard, not a surprise you discover during an incident post-mortem.
Close the Loop with Continuous Tuning: Ensure triage results aren’t just filed away to die in an archive. Establish a feedback loop where the results of both human and AI investigations are automatically routed back to tune the underlying detection rules. This transforms your SOC from a static “filter” into a learning system that evolves with every alert.
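The “AI Gym” / “Golden Set” step above can be sketched as a tiny benchmark harness. The agent here is a stand-in lambda and the 95% pass bar is an illustrative assumption; the real version would wrap your triage agent and your 50–100 curated incidents:

```python
def run_golden_set(agent_fn, golden_set):
    """Score an agent against curated past incidents with known-correct verdicts.

    golden_set: list of (incident_facts, expected_verdict) pairs.
    agent_fn: whatever callable wraps your triage agent.
    """
    results = [(agent_fn(facts), expected) for facts, expected in golden_set]
    correct = sum(1 for got, expected in results if got == expected)
    accuracy = correct / len(golden_set)
    return {"accuracy": accuracy, "passed": accuracy >= 0.95}  # illustrative bar

# Stand-in agent: escalates anything touching prod.
toy_agent = lambda facts: "escalate" if "prod" in facts["asset"] else "close"
golden = [
    ({"asset": "prod-db"}, "escalate"),
    ({"asset": "dev-vm"}, "close"),
    ({"asset": "prod-web"}, "escalate"),
    ({"asset": "test-box"}, "close"),
]
print(run_golden_set(toy_agent, golden))  # perfect score on the toy set
```

Run this on every agent or prompt change, the same way the DaC pipeline runs detection regression tests: no pass, no prod.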
Where you arrive: you have a fact-based visual that shows your SOC getting better in ways important to your mission after you add AI (in fact, your SOC will get better even before AI, but after you do the prep work from this document).
As a result, we can hopefully get to this instead:
Better introduction of AI into SOC
The path to an AI-ready SOC isn’t paved with new tools; it’s paved with better data, cleaner processes, and a fundamental shift in how we think about human-machine collaboration. If you ignore these pillars, your AI journey will be a series of expensive lessons in why “magic” isn’t a strategy.
But if you get these right? You move from a SOC that is constantly drowning in alerts to a SOC that operates with truly 10X effectiveness.
P.S. Anton, you said “10X”, so how does this relate to ASO and “engineering-led” D&R? I am glad you asked. The five pillars we outlined are not just steps for AI; they are also the steps on the road to ASO (see the original 2021 paper, which is still “the future” for many).
ASO is the vision for a 10X transformation of the SOC, driven by an adaptive, agile, and highly automated approach to threats. The focus on codified, machine-intelligible workflows, a modern stack supporting Detection-as-Code, and reskilling analysts as “Agent Supervisors” directly supports the core of engineering-led D&R. So by focusing on these five readiness dimensions, you move from a traditional operations room (lots of “O” for operations) to a scalable, engineering-centric D&R function (where “E” for engineering dominates).
So, which pillar is your SOC’s current ‘weakest link’? Let’s discuss in the comments and on socials!
Five years. It’s enough time to fully launch a cloud migration, deploy a new SIEM, or — if you’re a very large enterprise — just start thinking about doing the first two. It’s also how long Tim and I have been subjecting the world to our thoughts on Cloud Security Podcast by Google.
We finally got around to writing the annual “reflections blog.” And, honestly, looking back at Season 5, the state of the industry feels a lot like a chaotic Cybersecurity Garage Sale.
We’re all standing knee-deep in a pile of dusty, obsolete junk — the mid-2000s SIEMs, the 1990s unauthenticated vulnerability scans — while clutching shiny, still-in-the-box AI Agent gadgets we don’t quite know where to put. It’s a mess. But within this mess, a few essential, high-value items have emerged.
So, to all our listeners — the veterans and the newcomers — thank you for sorting through the chaos with us. For Season 6, we’re going all video, by default (opening January 5, 2026). Find us on our new YouTube home: Cloud Security Podcast by Google on YouTube.
Below you will find 3 fun sections: Anton’s faves, Tim’s faves and top 10 by listens (“data’s faves” of sorts, or perhaps listener faves)
Enjoy!
Anton: My selections are, perhaps, a bit predictable — but they were immense fun to record and, I believe, are absolutely essential listening! But, hey, I am biased a bit!
EP236 Accelerated SIEM Journey: A SOC Leader’s Playbook for Modernization and AI This fun episode provides a playbook for SOC leaders on accelerating their SIEM modernization journey. We go into the steps the bank took for moving beyond legacy systems, focusing on how to integrate AI for transformative results and build a truly modern Security Operations Center.
EP254 Escaping 1990s Vulnerability Management: From Unauthenticated Scans to AI-Driven Mitigation This essential episode with Caleb Hoch tackles the “fractions of a century” time lag in vulnerability management, moving beyond endless unauthenticated scans. We discuss how to establish a Gold Standard prioritization model and why running VM Tabletop Exercises is the vital, transformative practice needed for true modernization.
EP223 AI Addressable, Not AI Solvable: Reflections from RSA 2025 The single most important lesson from RSA 2025 was captured in this episode: AI is merely “Addressable, Not Solvable.” We cut through the hype to discuss where AI can deliver real, practical security value, and where we still need our smart human colleagues to lead the way. This is essential listening for anyone trying to navigate the flood of vendor claims.
EP242 The AI SOC: Is This The Automation We’ve Been Waiting For? This epic episode tackles the most pressing question for security operations: Can “AI SOC” deliver the transformative automation we’ve been waiting for? We discuss — with Anton’s former colleague — the real-world applications of AI in the SOC, focusing on practical gains (and how to know you “gained” anything) and what it means for the future role of the human analyst.
EP238 Google Lessons for Using AI Agents for Securing Our Enterprise This fun episode brings you practical lessons from Google’s own experience using AI agents to secure our enterprise at scale (see this blog also). We dive (not “delve”, mind you!) deep into the real-world application of this technology, focusing on the wins, the challenges, and what it took to adopt. This is essential listening for any leader looking to leverage AI agents effectively without falling into the hype cesspool.
BONUS: EP237 Making Security Personal at the Speed and Scale of TikTok This unique episode goes into what it takes to secure a hyper-scale, global platform like TikTok. We discuss how to move beyond legacy compliance while living in a modern microservices architecture, balance a consistent global security posture with localized regulatory demands, and, most importantly, empower every user with practical tips (like 2FA and strong passphrases) to make security personal.
Tim: My picks almost entirely don’t overlap with Anton’s. We started our lists separately, but then realized that we scooped each other on two episodes. We both liked our episode with Manija Poulatova enough to keep her on both of our lists!
EP256 Rewiring Democracy & Hacking Trust: Bruce Schneier on the AI Offense-Defense Balance This episode is a total delight for both of us. For me, I got to not only meet one of my security heroes, I got to see Anton do the same! We named Bruce in our early planning docs as somebody we’d like to have on the show someday when we’re all grown up. Not a bad way to wrap up five years of weekly podcasting!
EP236 Accelerated SIEM Journey: A SOC Leader’s Playbook for Modernization and AI Manija and I were on a panel together in Las Vegas during Google Cloud Next 2025. A few themes from that panel came through in our episode together that I love and think are vital for anyone. First, aim for transformation not migration. As an industry we are not doing so well compared to air transport safety. We cannot cling to our old ways and hope for a better set of outcomes. Second, AI is here to enable our human colleagues, not replace them. We can find greater meaning, joy, and productivity in our work, even as SOC analysts, once we embrace what AI can automate for us.
EP239 Linux Security: The Detection and Response Disconnect and Where Is My Agentless EDR Craig was introduced to me by Friend Of The Show (and friend of mine!) Vijay Ganti (EP196) as someone building an innovative approach to EDR security. Scheduling this episode ended up a little tricky, and I got to do an episode without Anton. That ended up ok, because in Craig I found a totally kindred spirit. We’ve both built systems to secure Linux without agents, though from two different approaches. His stories of finding badness in places we couldn’t previously look, and doing so scalably even for phone towers up the hill behind his house, really resonated with the part of me that spent four years building out Virtual Machine Threat Detection here at Google Cloud. This is definitely an episode for listeners who like to question conventional security thinking.
EP252 The Agentic SOC Reality: Governing AI Agents, Data Fidelity, and Measuring Success Another fun origin story: this episode was conceived in a karaoke booth in Singapore. Alex and Lars are two of our early design partners for the SecOps Triage Agent and their feedback to the team and on this episode is super valuable. Alex gets bonus points on this episode for using the word squelch which I’ve been pushing internally as a metaphor for our noise control systems. This is a must-listen for anyone interested in real AI adoption in their SOC. If Alex and Lars can do it across an unbelievable number of regulatory jurisdictions, you can too!
BONUS: EP232 The Human Element of Privacy: Protecting High-Risk Targets and Designing Systems I get one bonus episode for our top ten, so I’m going to include my classmate Sarah Aoun. She is an amazing Googler and on this episode she offers advice that’s useful almost universally, but especially if you believe that you’re a person who is at risk of being targeted online. This is firmly outside of our “cloud security” wheelhouse, but well worth a listen to understand threat modeling and security response for individuals.
Top 10 episodes by listens (excluding the oldest 3)
… so I went around and asked a whole bunch of AIs and agents and such. Then massaged and aggregated the outputs, then ran more AI on the result. And then lightly curated it. Then deleted the bottom 2 stupidest points they made.
So, here it comes … in all its sloppy glory!
The Foundational Roots and Unchanging Mission: Our show started with foundational cloud security topics — like Zero Trust, Data Security, and Cloud Migration Security which drew the initial large audiences. The core commitment since Episode 1 has been to question conventional wisdom, avoid “security theater” (EP248) and explore whether security measures truly benefit the user and the organization.
The AI Transformation: We had a sizable shift with the last 50 episodes, where AI became a central theme, or at least one of the themes we always come back to (and, yes, this covers our 3 pillars of securing AI, AI for security and countering the AI-armed attacker). The focus has moved past general hype to practical applications, securing AI systems, and asking challenging questions like “Data readiness for AI SOC” (EP249).
The Enduring Popularity of Detection & Response (D&R): We highlight that D&R and modernizing the SOC continue to be extremely popular with the audience (EP236 is epic). Trace the evolution of this topic from foundational engineering (like the very popular EP75 on scaling D&R at Google) to the architectural questions in EP250.
“How Google Does Security” Sells the Tickets: We love the episodes offering a candid look behind Google’s security curtain on topics like internal red teaming, detection scaling, and Cloud IR tabletops. They consistently remain perennial audience favorites (the latest in this series is EP238 on how we use AI agents for security).
The Centrality of People and Process: We emphasize the recurring lessons that the most challenging aspects of large-scale cloud (and now AI) security transformations are often the “people” and “process” elements, not the technical “tech” itself. EP237 is an epic example of this.
The Call for Intentionality: We reinforce the importance of having a clear purpose for every security activity and following an engineering-led approach (EP117). The “magical” advice from EP236 is: to ask of every security element, “what is it in service of?”
The Persistence of Old Problems: We often lament with a touch of humor on the industry’s tendency to repeat fundamental security mistakes (the SIEM Paradox in EP234 for instance or EP223 in general), underscoring the ongoing need to cover “boring” basics. We will absolutely continue this (a new episode on vulnerability management “stale” problems is coming soon)
Community and Format Growth: We continue to “sorta-kinda” (human wrote this, eh?) support the development of the podcast beyond a purely audio medium, including the launch of live video sessions and a Community site to foster more dialogue and feedback.
The Unique Culture and Authenticity of the Show Stays: We remain obsessed with selecting high-energy, vocal, and knowledgeable guests and fun topics. We will keep on with our “inside jokes,” like not allowing guests to recommend Anton’s blog as an episode resource and pokes about firewall appliances in the cloud (they are there).
A Glimpse at 300: We want to tease future topics that will define the next 50+ episodes, such as deeper dives into Agentic AI, challenges of cross-cloud incident response and forensics, or the geopolitical aspects of cloud security. Give us ideas, will ya? Otherwise, you will get to hear about AI and D&R much of the time…
In the early 1900s, factory owners bolted the new electric dynamo onto their old, central-shaft-and-pulley systems. They thought they were modernizing, but they were just doing a “retrofit.” The massive productivity boom didn’t arrive until they completely re-architected the factory around the new unit-drive motor (metaphor source).
Today’s AI agent slapped onto a broken, 1990s-style SOC process stack is the same. Everyone is chasing the shiniest LLM or agentic system to “AI-enable” their existing, often sclerotic, processes. The result is an AI retrofit that instantly slams into deeper, systemic bottlenecks.
So, how to tell if your SOC is AI ready?
Five pillars of the MODEL of an AI-ready SOC:
SOC Data Foundations
SOC Process Framework and Maturity
SOC Human Element and Skills
Modern SOC Technology Stack
SOC Metrics and Feedback Loop
Now, the details:
#1 SOC Data Foundations. Security context (why context?) and data are available and can be queried by machines (API, MCP, etc) in a scalable and reliable manner (Both! If unreliable, humans will need to fix it and the project dies). Scalable, fast, and reliable all matter: agents can screen-scrape, sure, but you probably won’t use that to get a gig of mainframe logs via tn3270. Federated often also means “not scalable and reliable,” BTW, because access to cheap/slow storage is, well, slow.
Of course, while availability and reliability are crucial, “AI ready SOC” also means data quality, structure, and governance. GIGO is still law! Scalability is necessary, but the quality of the ingested security context is the difference between this AI thing working … or not.
Questions to ask yourself:
Can all security telemetry and context data (logs, asset inventory, user context, etc) be quickly and reliably queried at scale?
Do we have formal data quality, structure, and governance processes in place to prevent “Garbage In, Garbage Out” from sabotaging our AI efforts?
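The GIGO question above is answerable with a very small gate in the pipeline: reject or flag records missing the context an agent will need. A minimal sketch with hypothetical field names; your required-field list comes from your own schema and governance rules:

```python
# Illustrative required context fields; replace with your real schema.
REQUIRED_FIELDS = {"timestamp", "asset_id", "event_type", "source"}

def quality_check(record: dict) -> list:
    """Return a list of data-quality problems; empty list means clean."""
    problems = [f"missing:{f}" for f in REQUIRED_FIELDS - record.keys()]
    if record.get("asset_id") == "unknown":
        problems.append("ungoverned:asset_id")  # asset not in inventory
    return problems

good = {"timestamp": "2026-01-05T10:00:00Z", "asset_id": "srv-42",
        "event_type": "login", "source": "edr"}
bad = {"timestamp": "2026-01-05T10:00:00Z", "event_type": "login"}

print(quality_check(good))          # []
print(sorted(quality_check(bad)))   # ['missing:asset_id', 'missing:source']
```

Trending the problem counts over time also gives you a pillar #5 metric for free: is the garbage going up or down?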
#2 SOC Process Framework and Maturity. Common SOC workflows that do NOT rely on human-to-human communication (“nobody knows what server4 does, let’s see if John knows, well, he does not, but he suggested Joanna does, and — WIN! — she really does” workflows are not agent-friendly) are essential for AI success. If your SOC has a lot of ad hoc activities, agents will (at least initially!) have trouble. Worse news: weak process (this pillar #2) is very often a close friend of weak data access (pillar #1), so they “double-team” your agentic effort to oblivion. This has sunk plenty of SOAR projects in its time.
Ultimately, “If your teams don’t know who owns what, neither will your Agents.” (source). Your SOC processes must be documented, validated, and capable of being scaled and learned from (see pillar #5 below). This includes a way to train AI on past work and SOC history.
Questions to ask yourself:
Can our most common investigation and response workflows be followed by an agent based purely on documentation, without the need for ad-hoc human-to-human queries?
Do we have a system in place to train AI agents on our past work/history of alert triage and resolution to enable learning and continuous process improvement?
#3 SOC Human Element and Skills: Cultivating a culture of augmentation, redefining analyst roles, providing training for human-AI collaboration, and embracing a leadership mindset that accepts probabilistic outcomes. You really, really need executives who support “augmented” AI SOC vision, not those who seek to “kill off” the humans.
Also, they should accept that machines will make mistakes, and that is OK. In fact, leaders must not just accept “probabilistic outcomes,” but explicitly be comfortable with the machine resolving some alerts, even if it’s sometimes wrong. This acceptance of necessary imperfection is a core readiness indicator. If they expect perfection, you will have AI SOC for a month. And then go back to printing logs and reviewing them with sad little human eyes :-)
Questions to ask yourself:
Are our leaders explicitly comfortable with the AI / machine autonomously closing alerts, even if it introduces an acceptable, measured error rate?
Have we redefined our analysts’ roles and provided training to shift their focus from manual alert triage to creative problem-solving and AI ‘shepherding’?
#4 Modern SOC Technology Stack: Implementing integrated and interoperable technologies that support intelligent systems and embed AI into existing workflows. This one is the least critical of my pillar batch, but it still matters. Also, it is often a dependency for #1, so it matters for that reason as well.
The critical point is that a single “AI tool” is not the goal. The technology stack must ensure the entire security ecosystem is interoperable and flexible enough to support the other pillars. This means you can remediate, mitigate, etc.
Questions to ask yourself:
Is our security stack interoperable and flexible enough to allow new AI capabilities to integrate seamlessly, or are we reliant on siloed, single-function security tools?
Will any of our tools be overrun with agentic AI request volumes?
#5 Metrics and Feedback Loop: You are ready for AI if you can, after adding AI, answer the “what got better?” question. You need metrics and a feedback loop to get better. And to know you got better. If you “add AI” to a bad, old SOC, not only will you not get better, you won’t even know you didn’t get better.
Metrics are a must here. Without a defined way to measure value and feed the results back into the AI models and processes, the transformation risks stalling into, at best, a “retrofit”, into nothing, or into an even worse situation…
Questions to ask yourself:
Can we quantify the value of AI by measuring the improvements that resulted from it?
Do we have an automated, continuous feedback loop to ensure AI model decisions and performance metrics are fed back into process documentation and model retraining?
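A minimal sketch of the “what got better?” measurement, with hypothetical metric names and numbers (the real metric set will be SOC-specific):

```python
# Hypothetical sketch of answering "what got better?" after adding AI:
# compare a couple of SOC metrics before and after. Metric names and
# numbers are illustrative, not from any real deployment.

def improvement(before, after):
    """Percent change per metric; negative is better for time/backlog metrics."""
    return {m: round(100.0 * (after[m] - before[m]) / before[m], 1)
            for m in before}

before = {"mean_triage_minutes": 30.0, "alert_backlog": 1200.0}
after  = {"mean_triage_minutes": 12.0, "alert_backlog": 400.0}
print(improvement(before, after))
# {'mean_triage_minutes': -60.0, 'alert_backlog': -66.7}
```

The trick, of course, is capturing the “before” baseline at all; without it, the “what got better?” question above is unanswerable by construction.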
The pillars should be framed not just as pre-requisites for AI adoption, but as the building blocks for a completely re-architected Security Operations Center. The transformation is about reimagining the whole way of doing things, not just accelerating one piece of an old process.
When I joined Chronicle in the summer of 2019 — a name now rolled into the broader Google SecOps product (with SOAR by Siemplify and threat intel by Mandiant) — it was very much a startup. Yes, we were part of Alphabet, but the spirit, the frantic energy, the drive — it was a startup to its core.
And here’s the kicker (and a side rant!): I’m fundamentally allergic to large companies. Those who know me have heard me utter this countless times. So, in a matter of weeks after joining a small company, I found myself working for a very large one indeed.
To me, that pivot, that blending of startup momentum and big company scale, is, in many ways, the secret sauce behind our success today. It turns out, you need both the wild ambition of a young vendor and the solid foundation of a massive enterprise to truly move the needle (and the dots on the MQ … but these usually reflect customer realities).
The MQ and the Price of Poker
Now, as a reformed analyst who spent eight years in the Gartner trenches, I’ll clear up a misconception right away: the Magic Quadrant placement has precisely zero to do with how much a vendor pays Gartner. Trust me, there are vendors in highly visible SIEM MQ positions who’ve probably never sent Gartner a dime over the years.
Conversely, there are large organizations that have paid a fortune and have been completely excluded from the report. The MQ placement reflects customer traction and market reality (usually — there are sad yet very rare exceptions to this, and I will NOT talk about them; there is not enough whiskey in the world to make me). MQ placement is a measure of genuine success, not a destination achieved by writing a big check.
The Evolution of SIEM: Where Did the Brothers Go?
Reflecting on the last few years in SIEM (not 20 years!) and looking at the current MQ, a few things that were once controversial are now conventional wisdom:
SIEM must be SaaS and Cloud-Native. I’m old enough to remember when the idea of trusting your security data to the cloud was an existential debate. Today, with the relentless attack surface expansion, perhaps more people are realizing that the biggest risk is actually running a vulnerable, constantly-compromised on-prem SIEM stack. Data gravity shifted.
SIEM and SOAR are fully merged. They are, in essence, two inseparable brothers forming the core of modern SIEM — detection and response. SIEM is really SIEM/SOAR in 2025. Standalone SOAR vendors do exist and some “AI SOC” vendors are really “SOAR 3.0”, but these are — IMHO — outliers compared to the mainstream SIEM.
The UEBA brother got absorbed, but … Remember the mid-2010s, when User and Entity Behavior Analytics (UEBA) was the new shiny toy, all driven by cool machine learning? While it was an equal brother to SOAR for a moment, it has now largely been absorbed into the detection stack of the main SIEM product. Machine learning’s importance for basic threat detection has subtly decreased (odd…isn’t it?). UEBA has become a single, albeit important, feature within the engine, not a standalone platform.
Some XDR vendors graduated to real SIEM. EDR-centric SIEM vendors (XDR, if you have to go there), have landed. IMHO, these guys will do some heavy damage in the market in the next 1–2 years.
The Most Powerful Force in the Universe: IT Inertia
When I left Gartner, I famously outlined one key lesson from my analyst time: IT inertia is the most powerful force in the universe.
When you look at the MQ, you might see what looks like “same old, same old,” with certain large, established vendors still floating around. This is NOT about who pays, really! You might not believe it, but this placement absolutely reflects enterprise reality. Large vendors don’t die immediately.
Case in point: it took one particularly prominent legacy SIEM vendor (OK, I will name this one as it is finally dead for real, ArcSight) almost ten years to truly disappear from the minds of practitioners. Most companies were abandoning that technology around 2017–2018, but the vendor only truly died off in the market narrative in 2025. The installed base hangs on, dragging the demise out over a decade.
AI, Agents, and the Missing Tsunami
Finally, a quick note on the current darling: Generative AI and AI Agents.
While some vendors (and observers) expected a massive, dramatic impact from Generative AI on this year’s MQ, it simply hasn’t materialized — yet. As other Gartner papers will tell you, AI does not drive SIEM purchasing behavior today.
Why? Gartner’s assessment is based on customer reports. Vendors can yell all they want about how AI is dramatically impacting their customers, but until those customers report observable, dramatic improvements and efficiencies to Gartner, the impact is considered non-existent in the MQ reality.
The AI tsunami is coming, but for now, the market is still focused on the fundamentals: cloud-native scale, effective detection, and fast/good (AND, not OR) response. Getting those right is what puts you in the Leaders Quadrant. The rest is just noise…
Other SIEM MQ 2025 comments can be found here (more to be added as they surface…)
This is an ILLUSTRATION by Gemini, NOT a technical diagram :-)
In the world of security operations, there is a growing fascination with the concept of a “decoupled SIEM,” where detection, reporting, workflows, data storage, parsing (sometimes) and collection are separated into distinct components, some sold by different vendors.
Closely related to this is the idea of federated log search, which allows data to be queried on demand from various locations without first centralizing it in a single system.
When you combine these two trends with the emergence of AI agents and the “AI SOC,” a compelling vision appears — one where many of security operations’ biggest troubles are solved in an elegant and highly automated fashion. Magic!
(Is my math mathing? Cheap + good + fast + AI powered … pick any …ehh… I digress!)
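For illustration only, a toy sketch of what federated log search does under the hood: fan a query out to several stores and merge the results. The store names and the query API are invented; real federated search adds authentication, schema mapping, pagination, and retry logic on top of this:

```python
# For illustration only: a toy federated log search that fans a query out
# to several stores and merges results. Store names and the query API are
# invented; real implementations add auth, schema mapping, pagination,
# and per-source failure handling on top of this.

def federated_search(query, stores):
    results, failed = [], []
    for name, search_fn in stores.items():
        try:
            results.extend(search_fn(query))
        except Exception as exc:
            # one flaky source degrades every query into a partial answer
            failed.append((name, str(exc)))
    return {"hits": results, "partial": bool(failed), "failed_sources": failed}

def broken_store(query):  # e.g. an on-prem box behind a slow link
    raise TimeoutError("slow backend")

stores = {
    "edr_datalake": lambda q: [{"src": "edr", "match": q}],
    "cloud_logs":   lambda q: [{"src": "cloud", "match": q}],
    "old_siem":     broken_store,
}
out = federated_search("user=alice", stores)
print(out["partial"], len(out["hits"]))  # True 2
```

Note how a single flaky source turns every query into a partial answer; multiply that by schema drift across sources and you get a taste of the operational messiness in question.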
However, a look at the market reveals a conflicting — dare I say opposite — trend. Many organizations are actively choosing the very opposite approach: tightly integrated platforms where search, dashboards, detection, data collection, and AI capabilities are bundled together — and additional things are added on top (such as EDR).
Let’s call this “EDR-ized SIEM” or “SIEM with XDR-inspired elements” (for those who think they can define XDR) or “supercoupled SIEM” (but this last one is a bit of a mouthful…)
While some suggest this is a split between large enterprises choosing disaggregated stacks and smaller companies opting for closer integration, this doesn’t fully capture the success rates of these different models (one is successful, and the other is, well, also successful but at a very small number of extra-large, engineering-heavy organizations).
If one were to take a contrarian view (as I will in this post!), it might be that the decoupled and federated approach, with or without AI agents, is destined to be a secondary, auxiliary path in the evolution of SIEM.
This isn’t a nostalgic vote for outdated, 1990s-era ideas (“gimme a 1U SIEM appliance with MySQL embedded!”), but rather a realistic assessment based on past lessons, such as the niche fascination with security data science.
Many years ago (2012), while at Gartner, I wrote a notorious “Big Analytics for Security: A Harbinger or An Outlier?” (archive, repost), and it is now very clear that the late-2000s/early-2010s security data science “successes” remained a tiny micro-minority of examples. A trend can be emergent, growing tenfold from a tiny base of 0.01% of companies, yet still only reach 0.1% of the market — making it an outlier, not a harbinger of the mainstream future.
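The “harbinger or outlier” arithmetic, made explicit:

```python
# Tenfold growth from a tiny base is still a tiny share of the market.
base_adoption = 0.0001   # 0.01% of companies
growth_factor = 10       # "growing tenfold"
print(f"{base_adoption * growth_factor:.1%} of the market")  # 0.1% of the market
```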
Ultimately, the evidence suggests that a decoupled, federated architecture will not form the basis of the typical SIEM of 2027. Instead, the centralized platform model, enhanced and supercharged by AI, will reign supreme (and, yes, it will also include some auxiliary decentralized elements as needed, think of it as “90% centralized / 10% federated SIEM” — a better model for the future).
My conclusion:
SIEM has a future! If you hate SIEM so much that you … rename it, then, well, SIEM still has a future (hi XDR!)
However, decoupled SIEM and federated log search (In My NSHO) are not THE future of SIEM.
I think this because both are just too damn messy for many clients to make them work well. They also fail many compliance tests (well, the federated part, not the decoupled).
AI and AI agents are a very big part of the SIEM future. However, AI agents do not make decoupled SIEM and federated log search any less messy (“I didn’t save any logs from X, hey AI agent .. get me logs from X” does not work IRL)
Put another way:
The Romantic Ideal: The theory is that scalable data platforms and specialized threat analysis are dramatically different, so they should be handled by specialists, and modern APIs should make connecting them “easy.” Magic!
The Real Reality: A natively designed, single-vendor, integrated SIEM is inherently simpler and easier to manage and support than a multi-component stack you have to assemble “at home.” It is also faster! AI integrated inside it just works better. With decoupling, you also lose the benefit of having a “single face to scream at” when things break. Reality!
Here is my “decoupled SIEM reading list” (all fun reads, obviously not all I agree with):
SOC Visibility Triad is Now A Quad — SOC Visibility Quad 2025
I will be really, really honest with you — I have been totally “writer-blocked” and so I decided to release it anyway today … given the date.
A bit of history first. My “SOC visibility triad” was released on August 4, 2015 as a Gartner blog (it then appeared in quite a few papers, and kinda became a thing). It stated that to have good SOC visibility you need to monitor logs (L), endpoint (E) and network (N) sources. So, L+E+N was the original triad of 2015. Note that this covers monitoring mechanisms, not domains of security (more on this later; this matters!)
5 years later, in 2020, I revisited the triad, and after some agonizing thinking (shown at the above link), I kept it a triad. Not a quad, not a pentagram, not a freakin’ hex.
So, here in 2025, I am going to agonize much more .. and then make a call (hint: blog title has a spoiler!)
How do we change my triad?
First, should we …
… Cut Off a Leg?
Let’s look at whether the three original pillars should still be here in 2025. We are, of course, talking about endpoint visibility, network visibility and logs.
My 2020 analysis concluded that the triad is still very relevant, but the potential for a fourth pillar is emerging. Before we commit to this possibly being a SOC visibility quad — that is, dangerously close to a quadrant — let’s check if any of the original pillars need to be removed.
Many organizations have evolved quite a bit since 2015 (duh!). At the same time, there are many organizations where IT processes seemingly have not evolved all that much since the 1990s (oops!).
First, I would venture a guess that, given that the EDR business is booming, endpoint visibility is still key to most security operations teams. A recent debate of Sysmon versus EDR is a reflection of that. Admittedly, EDR-centric SOCs peaked perhaps in 2021, and XDR fortunately died since that time, but endpoints still matter.
Similarly, while the importance of sniffing the traffic has been slowly decreasing due to encryption and bandwidth growth, cloud native environments and more distributed work, network monitoring (now officially called NDR) is still quite relevant at many companies. You may say that “tcpdump was created in 1988” and that “1980s are so over”, but people still sniff. Packets, that is.
The third pillar of the original triad — logs — needs no defense. Log analysis is very much a booming business and the arrival of modern IT infrastructure and practices, cloud DevOps and others have only bolstered the importance of logs (and of course their volume). A small nit appears here: are eBPF traces logs? Let’s defer this question, we don’t need this answer to reassert the dominance of logs for detection and response.
At this point, I consider the original three legs of a triad to be well defended. They are still relevant, even though it is very clear that for true cloud native environments, the role of E (endpoint) and N (network) has decreased in relative terms, while importance of logs increased (logs became more load bearing? Yes!)
Second, should we …
Add a Leg?
Now for the additions. I’ve had a few recent discussions with people about this, and I’m happy to go through a few candidates.
Add Cloud Visibility?
First, let’s tackle cloud. There are some arguments that cloud represents a new visibility pillar. The arguments in favor include the fact that cloud environments are different and that cloud visibility is critical. However, to me, a strong counterpoint is that cloud visibility, in many cases, is provided by endpoint, network, and logs, as well as a few other things. We will touch on these “few things” in a moment.
YES?
Cloud native environments are different, they suppress E and N
Cloud visibility is crucial today
Addresses unique cloud challenges
Cloud context is different, even if E and N pillars are used for visibility
NO, not a new pillar, part of triad already (via all other pillars)
Add Identity Visibility?
The second candidate to be added is, of course, identity. Here we have a much stronger case that identity needs to be added as a pillar. So perhaps we would have an endpoint, network, logs and identity as our model. Let’s review some pros and cons for identity as a visibility pillar.
YES?
Identity is key in the cloud; we observe a lot of things via IDP … logs (wait.. we already have a pillar called “logs”)
By making identity a dedicated pillar, organizations can ensure that it receives the attention it deserves
Sorry, still a NO, but a weak NO. Identity is critical as context for logs, endpoint data and network telemetry, but it is not (on its own) a visibility mechanism.
Still, I don’t want to say that identity is merely about logs, because “baby … bathwater.” Some of the emerging ITDR solutions are not simply relying on logs. I don’t think that identity is necessarily a new pillar, but there are strong arguments that perhaps it should be…
What do you think — should identity be a new visibility pillar?
Now let’s tackle the final candidate, the one I considered in 2020 to be the fourth leg of a three legged stool. There is, of course, application visibility, powered by the increased popularity of observability data, eBPF, etc. Application visibility is not really covered by endpoint logs and definitely not by EDR observation. Similarly, application visibility is very hard to deduce from network traffic data.
YES?
Application visibility is not covered by E and N well enough
SaaS, cloud applications and — YES! — AI agents require deep application visibility.
This enables deeper insight into the app guts, as well as business logic
NO?
Is it just logs? Is it, though?
Do organizations have to do application visibility (via ADR or whatever)? Is this a MUST-HAVE … but for 2030?
Are many really ready for it in their SOCs today?
Verdict:
YES! I think to have a good 2025 SOC you must have the 4th pillar of application visibility.
And, yes, many are not ready for it yet, but this is coming…
So, we have a winner. Anton’s SOC visibility QUAD of 2025
“Google Cloud’s latest research highlights that common hygiene gaps like credential issues and misconfigurations are persistently exploited by threat actors to gain entry into cloud environments. During the first half of 2025, weak or absent credentials were the predominant threat, accounting for 47.1% of incidents. Misconfigurations (29.4%) and API/UI compromises (11.8%) followed as the next most frequently observed initial access vectors.“
THR 12 cloud compromise visual
“Notably, compared to H2 2024, we observed a 4.9% decrease in misconfiguration-based access and a 5.3% decrease in API/UI compromises (i.e., when an unauthorized entity gains access to, or manipulates a system or data through an application’s user-facing screen or its programmatic connections). This shift appears to be partly absorbed by the rise of leaked credentials representing 2.9% of initial access in H1 2025. ” [A.C. — It gently suggests that while we’re making some progress on configurations, the attackers are moving to where the fruit is even more low-hanging: already leaked credentials.]
“Foundational security remains the strongest defense: Google Cloud research indicates that credential compromise and misconfiguration remain the primary entry points for threat actors into cloud environments, emphasizing the critical need for robust identity and access management and proactive vulnerability management.” [A.C. — it won’t be the magical AI that saves you, it will be not giving admin rights to every employee]
“Financially motivated threat groups are increasingly targeting backup systems as part of their primary objective, challenging traditional disaster recovery, and underscoring the need for resilient solutions like Cloud Isolated Recovery Environments (CIRE) to ensure business continuity.” [A.C. — if your key defense against ransomware is still backups, well, we got some “news” for you…]
“Advanced threat actors are leveraging social engineering to steal credentials and session cookies, bypassing MFA to compromise cloud environments for financial theft, often targeting high-value assets.” [A.C. — this is NOT an anti-MFA stance, this is a reminder that MFA helps a whole lot, yet if yours can be bypassed, then its value diminishes]
“Threat actors are increasingly co-opting trusted cloud storage services as a key component in their initial attack chains, deceptively using these platforms to host seemingly benign decoy files, often PDFs.“ and “threat actors used .desktop files to infect systems by downloading decoy PDFs from legitimate cloud storage services from multiple providers, a tactic that deceives victims while additional malicious payloads are downloaded in the background” [A.C. — a nice example here of the attacker thinking ahead about how the defender will respond]
“more traditional disaster recovery approaches, focused primarily on technical restoration, often fall short in addressing the complexities of recovering from a cyber event, particularly the need to re-establish trust with third parties.” [A.C. — The technical recovery is only half the battle. This speaks to the human element of incident response, and the broader impact of a breach.]