
Received — 21 January 2026 Detection Engineering Weekly

DEW #142 - Slack's Agentic Triage Architecture, Detection <3's Data and Sigma evals

21 January 2026 at 13:54

Welcome to Issue #142 of Detection Engineering Weekly!

Every week, I read, watch and listen to all the Detection Engineering content so you can consume it all in 10 minutes. Subscribe and get a weekly digest of the latest and greatest in threat detection engineering!

✍️ Musings from the life of Zack:

  • I’m not usually a person who does New Year’s resolutions, but I’ve committed to small changes that have already made a positive impact in my life.

    • Using a notebook to take notes and to-dos at work

    • Meditating on Headspace four days a week

    • Playing video games twice a week. For some reason, I’m back on Dota2 so I’m sure that’ll be helpful for my mental health

  • There’s a 50/50 chance I’ll make DistrictCon this weekend :( There’s a massive snowstorm hitting Washington, D.C., and as a former Marylander, I can tell you that part of the country cannot handle snow

  • I’ve been messing with local MCP server development via stdio and HTTP APIs, and I’m starting to shill Claude Code to everyone I talk to. It ripped through a malware analysis at work a week or so ago, and we were able to hunt for IOCs in under 5 minutes.


💎 Detection Engineering Gem 💎

Streamlining Security Investigations with Agents by Dominic Marks

In the age of AI SOCs, it’s still hard to understand where the concept of agentic triage fits into everyday operations. Products tend to present the problem set and solutions in a clean, understandable way. This is a good thing: having a product company frame the space in clear, concise benefits and downsides helps a security operations team decide how much cost to incur in building or buying one.

Blogs like this show why transparency makes our industry awesome. Slack's security operations team published its work on building an in-house agent-based triage system. You see many of the same principles and concepts across products, but because there are no moats or trade secrets to protect, there’s a lot more to dig into.

What you see above is their approach to agent-to-agent orchestration. The top of the pyramid starts with a Director that leverages high-cost thinking models, which tend to take their time and deliberate on prompts and results. This makes sense from a planning and analysis perspective.

The Critic biases itself toward interrogating individual analyses from telemetry and alerts. It doesn’t require as much model cost, but it should spend a reasonable amount of time challenging assumptions and scrutinizing the lower-cost model's output. It presents the amalgamation of data and investigative output back to the Director, which probably runs thinking-mode models, where you spend the most money on tokens to understand whether the bottom parts of the pyramid performed their jobs correctly. The Director is the gate between the system and a human, so you want only high-quality analysis moving forward.

The phase transition diagram is super interesting because it puts the above “Director Poses Question..” investigation step into practice.

According to Marks, the Director makes decisions for each part of the phase to see whether it needs to close the investigation or continue it further. The “trace” component is where the Director engages an expert within their architecture to perform additional investigative analyses.

Honestly, it’s hard for me to provide my own analysis here, because the blog is just so complete. So, if you are skeptical of these types of setups, borrow or steal ideas from this Slack blog and try it on your own. The math seems reasonable: if you perform 5 investigations that take 2 hours each, and the system reduces 3 of them from 2 hours to 10 minutes each while catastrophically failing on the other 2, you still saved nearly six hours!


🔬 State of the Art

Data and Detect by Matthew Stevens

This post by Stevens dives a bit deeper into the concept of detection observability. In our field, we tend to focus on the research element of rules and detection opportunities, but we devote much less conversation to data quality. Remember, there is no rule without telemetry, and Stevens points out a concept around data usefulness that I think demonstrates this point perfectly.

Not all sources are the same when it comes to individual atomic qualities for alerting, but when you map them to techniques, you notice that the composite qualities (the sum of many data sources finding an attack chain) become crucial. The graph above, generated by Stevens, shows how important Process Monitoring is for data usefulness. In fact, without Process Monitoring, you lose close to 30% of the techniques you can combine with other data types to alert on.
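To make "composite quality" concrete, here’s a small Python sketch of the idea: count the ATT&CK techniques whose only telemetry comes from a single data source. The source-to-technique mapping below is invented for illustration and is not Stevens' actual data:

```python
# Which ATT&CK techniques lose ALL telemetry coverage if one data source
# disappears? The mapping below is hypothetical, for illustration only.
coverage = {
    "Process Monitoring":  {"T1059", "T1053", "T1055", "T1021"},
    "Network Traffic":     {"T1071", "T1021", "T1048"},
    "Authentication Logs": {"T1078", "T1021"},
    "File Monitoring":     {"T1005", "T1059"},
}

def techniques_lost(source: str) -> set[str]:
    """Techniques covered ONLY by `source` (no other data source sees them)."""
    others = set().union(*(t for s, t in coverage.items() if s != source))
    return coverage[source] - others

lost = techniques_lost("Process Monitoring")
total = set().union(*coverage.values())
print(f"Removing Process Monitoring loses {len(lost)}/{len(total)} techniques: {sorted(lost)}")
```

With richer real-world mappings, the same loop surfaces exactly which single points of failure exist in your telemetry pipeline.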

They also comment on how hard it is to build schemas and normalize telemetry so your teams can operate from a common lexicon when writing rules. This highlights that a large swath of the issues we deal with falls as heavily on the software and data engineering components of our jobs as on the threat research components.


Sigma Detection Classification by Cotool

Continuing Cotool’s research on security AI agent benchmark performance, they set up a website for studying performance on their benchmarks and released a new one on Sigma Detection classifications. The goal of this benchmark was to assess how well foundational models were trained on attack tactics and techniques. The Cotool team fed the full Sigma corpus, with the MITRE ATT&CK tags stripped, to 13 foundational models to see if they could correctly map the tags back to each rule.

Claude’s Opus and Sonnet 4.5 performed the best overall, with the highest F1-scores but also the highest cost, somewhat similar to what we saw in their last benchmark on the Botsv3 dataset. The team provided their analysis of these placements, along with their prompts and the tradecraft behind the evaluation, so others can run the same benchmarks as well.
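For readers who want to reproduce this kind of scoring, a micro-averaged F1 over predicted versus ground-truth tag sets is one plausible way a benchmark like this gets graded. The rules and tags below are made up; this is a sketch of the metric, not Cotool's actual harness:

```python
# Micro-averaged F1 for multi-label tag classification: pool true/false
# positives and false negatives across all rules, then compute F1 once.
def micro_f1(gold: dict[str, set[str]], pred: dict[str, set[str]]) -> float:
    tp = sum(len(gold[r] & pred.get(r, set())) for r in gold)
    fp = sum(len(pred.get(r, set()) - gold[r]) for r in gold)
    fn = sum(len(gold[r] - pred.get(r, set())) for r in gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical ground truth vs. a model's predicted ATT&CK tags per rule.
gold = {"rule_a": {"T1059", "T1027"}, "rule_b": {"T1021"}}
pred = {"rule_a": {"T1059"}, "rule_b": {"T1021", "T1078"}}
print(f"micro-F1: {micro_f1(gold, pred):.2f}")
```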


5 KQL Queries to Slash Your Containment Time in Microsoft Sentinel by Matt Swann

I have a biased view on what is and what is not a detection rule. Even to the point where I’ve reduced the concept of rules down to one definition: a rule is a search query. There is a rationale behind it: SIEMs and logging technologies require a search query to generate results. But, as I break out of my bubble, I notice that not all search queries have the same value from a detection point of view.

In this post, Swann demonstrates this concept through the lens of a security incident responder. When your goal is containment rather than a balanced cost of alerting, accuracy matters less because the aim is to use your analysis skills to find and evict threat actors as quickly as possible. Swann provides readers with five high-value KQL queries to help responders quickly orient around a potential intrusion. The cool part here is their unique experience in this field, even noting that some queries led to the discovery and containment of an active ransomware actor.


👊 Quick Hits

Detection as Code Home-Lab Architecture by Tobias Castleberry

I love seeing home-lab setups because there are many ways to set up an environment to practice advanced concepts with open-source and free software. This blog is part of a series by Castleberry where they document their journey from analyst to detection engineer, showcasing some of their expertise and what they’ve learned along the way.


Building your own AI SOC? Here’s how to succeed by Monzy Merza

Continuing the theme of demystifying AI SOCs and agentic security engineering from Marks’ Gem above, this blog by Merza provides an irreverent commentary on the state of building these architectures. There are some non-negotiables Merza points out, such as data normalization, the concept of a “knowledge graph”, and honing foundational models by giving them the right instructions rather than relying on them out of the box.


The Levenshtein Mile by Siddharth Avi Singh

Before the age of LLMs, there was a ton of research into, and implementation of, some pretty clever mathematical techniques to find and detect threats. I used to work for a threat intelligence product company that specialized in detecting phishing infrastructure, and one of the key elements of finding phishing is understanding what the victim organization owns, so you can see how threat actors try to abuse and socially engineer its customers.

In this post, Singh details the Levenshtein Distance algorithm. The basic premise is that you can measure how many single-character edits separate two strings and generate a distance score. If a candidate string lands within some threshold of similarity to one you care about, you can raise an alert for an analyst to investigate whether or not it is phishing. Domain names are the logical data source here, and you can review them from the public domain registries, DNS traffic, or the Certificate Transparency Log and try to proactively block them before they become an issue.
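Here’s a minimal sketch of the idea in Python: a standard dynamic-programming Levenshtein distance plus a toy typosquat check. The owned domains and the 2-edit threshold are assumptions for illustration, not values from Singh's post:

```python
# Levenshtein (edit) distance via dynamic programming, keeping only two rows.
def levenshtein(a: str, b: str) -> int:
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Toy typosquat check: alert on new domains within a few edits of ones we own.
OWNED = ["example.com", "examplebank.com"]  # hypothetical owned domains
THRESHOLD = 2                               # hypothetical alert threshold

def is_suspicious(candidate: str) -> bool:
    # distance 0 is an exact match (our own domain), so exclude it
    return any(0 < levenshtein(candidate, d) <= THRESHOLD for d in OWNED)

print(is_suspicious("examp1e.com"))  # one substitution away from example.com
```

In practice you would feed this from newly registered domains or CT log entries rather than a static list.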


☣️ Threat Landscape

After the Takedown: Excavating Abuse Infrastructure with DNS Sinkholes by Max van der Horst

This post by van der Horst helps readers understand what happens after a domain is sinkholed. We typically see news stories about a large botnet or ransomware operation being taken down, and the takedown includes seizing domain names used for command-and-control communications with victims. High fives and good vibes happen and then we focus on the next big thing.

van der Horst challenges this finality and argues that a sinkhole is more than just an interruption operation; it’s also a forensic artifact that helps discover more victims and additional malicious infrastructure. They downloaded several datasets, combining passive DNS and open-source intelligence feeds, to understand the rate of disruptions and how to perform temporal analysis of these takedowns to discover unreported infrastructure.

It also allows analysts to cluster activity and create new detections as new botnets or campaigns emerge, where many cases involve the reuse of code and infrastructure techniques.


How to Get Scammed (by DPRK Hackers) by OZ

This is a great article showing an individual infection chain done by a Contagious Interview threat actor. OZ accepts the bait on Discord and walks through how the DPRK-nexus threat actor tries to infect him by taking a malicious coding test. OZ brings receipts: there’s a lengthy Discord conversation where the threat actor prods OZ and eventually convinces them to apply for the job.

There’s some cool analysis with cloning the repository and using docker and pspy to inspect the malicious traffic.


What’s in the box !? by NetAskari

NetAskari, a security researcher, stumbled upon a Chinese-nexus threat actor’s “pen-test” machine and managed to download a bunch of their custom tooling for analysis. The Chinese hacker ecosystem is in a bubble, the result of both cultural and artificial barriers imposed by the PRC. These barriers create opportunities to build tooling, exploits, and software in a silo, so when you find a goldmine of tooling available for download, it’s always great to download it and see how other hackers are performing operations.

They found a litany of post-exploitation tools, some of which are custom-written and look similar to the likes of Cobalt Strike or Sliver, a bunch of custom Burp Suite extensions, and some malware families, like Godzilla, that were used in nation-state operations against the U.S.


Dutch police sell fake tickets to show how easily scams work by Danny Bradbury

I think phishing simulations at professional organizations are lame, but I actually think they work at scale against the general populace as a form of education. Apparently, the Dutch Police thought the same. They set up a fake ticket sales website and bought ads to trick victims into visiting and purchasing tickets for sold-out shows.

Tens of thousands of people visited the website, and several thousand bought tickets, which is a wild stat if your goal is stealing credit cards. Obviously, the Police did not steal credit cards; they used the purchase attempts as an educational opportunity to help folks understand the risks of online ticket fraud.


CVE-2025-64155 Fortinet FortiSIEM Arbitrary File Write Remote Code Execution Vulnerability by Horizon3.ai

From the blog:

“CVE-2025-64155 is a remote code execution vulnerability caused by improper neutralization of user-supplied input to an unauthenticated API endpoint exposed by the FortiSIEM phMonitor service.”

Oof. I can’t tell you the last time I saw a remote code execution vulnerability in SIEM technology.

The specific service, phMonitor, listens on port 7900. It serves as the control plane for these devices, much like the Kubernetes control plane, and supports orchestration and configuration API calls. I ran a quick scan of likely FortiSIEM devices on Censys and found over 5,000 publicly facing servers.

This blog has some details on the vulnerability, and, as with most FortiGuard and edge device vulnerabilities, user-supplied web request data with complex string parsing leads to a command injection deep within the application code.


🔗 Open Source

MHaggis/Security-Detections-MCP

Locally run MCP server for detection engineering. It leverages stdio transport, so nothing leaves your machine, which is always good if you are writing rules or queries that contain sensitive information. It exposes 28 tools that a local LLM client (Claude, Cursor) can use to look at detection coverage, MITRE classification, KQL queries and data source classification.
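If you want to wire a stdio MCP server like this into a client, a Claude-style config looks roughly like the following. The server name and command path here are placeholders, not taken from the repo’s README:

```json
{
  "mcpServers": {
    "security-detections": {
      "command": "python",
      "args": ["/path/to/security_detections_mcp/server.py"]
    }
  }
}
```

With stdio transport, the client spawns the process itself and talks to it over stdin/stdout, which is why nothing traverses the network.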


SeanHeelan/anamnesis-release

PoC of an LLM exploit generation harness. The README has an extensive background on how they approached benchmarking how fast Claude Opus and GPT 5.2, given no instruction, can analyze a vulnerability and generate exploit code. They introduced several constraints in test environments to challenge the models, such as removing certain syscalls, adding additional memory and operating system protections, and forcing the agents to generate an exploit with a callback.


tracebit-com/awesome-deception

Yet another awesome-* list on deception technology research, open-source repositories and conference talks.


mr-r3b00t/rmm_from_shotgunners_rmm_lol/main/mega_rmm_query.kql

This repository caught my eye because I’ve never seen a rule that started with the word “mega”. And when I say mega, I’m thinking a few hundred lines for something pretty complicated. But this RMM detection query rule is 3,000 lines long. Can you imagine needing to tune this?


ineesdv/Tangled

This is a clever phishing simulation platform that abuses iCalendar rendering to deliver legitimate-looking phishing invites. It leverages research from RenderBender, which abuses Outlook’s insecure parsing of the Organizer field.


Received — 14 January 2026 Detection Engineering Weekly

DEW #141 - K8s Detection Engineering, macOS EDR evasion, Cloud-native detection handbook

14 January 2026 at 14:03

Welcome to Issue #141 of Detection Engineering Weekly!


✍️ Musings from the life of Zack:

  • It was a long but restful month away from you all! I can’t wait to get back into writing every week for y’all

  • 🤝 I am accepting new sponsors for 2026! If you are interested in sponsoring the newsletter, shoot me an email at techy@detectionengineering.net. We are already almost halfway booked for Primary slots and now have Secondary slots so you have options!

  • I’ve started writing again for the Field Manual and I really love encapsulating my experience and knowledge into these posts. If you have ideas for Field Manual posts, comment below. I have my latest post below as the last story under State of the Art

This Week’s Primary Sponsor: Push Security

Want to learn how to respond to modern attacks that don’t touch the endpoint?

Modern attacks have evolved—most breaches today don’t start with malware or vulnerability exploitation. Instead, attackers are targeting business applications directly over the internet.

This means that the way security teams need to detect and respond has changed too.

Register for the latest webinar from Push Security on February 11 for an interactive, “choose-your-own-adventure” experience walking through modern IR scenarios, where your inputs will determine the course of our investigations.

Register Now


💎 Detection Engineering Gem 💎

A Brief Deep-Dive into Attacking and Defending Kubernetes by Alexis Obeng

For detection engineers, incident responders, and threat hunters who operate in a cloud-first environment, you’ve probably heard developers in your organization talk about Kubernetes (k8s for short). It’s an extremely popular container orchestration framework that has become the de facto standard for controlling scaling, application isolation, and cost. Whether you have it in your environment or you’ve never worked with it, it’s important to understand how security controls and detection opportunities work inside these environments, because Kubernetes is like an operating system of its own.

When Obeng first shared this research on a Slack server I was on, I was excited to read it because it’s truly a deep dive into Kubernetes security, as the title suggests. She started the blog by describing how unfamiliar this space was, and by the end, you could tell Obeng had become very familiar with detection and hunting scenarios in Kubernetes.

The blog starts with an introduction to k8s and breaks down the jargon, architecture, and nuances of how a Kubernetes environment operates. The most important thing I try to get folks to understand with k8s is that it’s separated into two detection planes. The control plane, as Obeng explains, “is the core of Kubernetes.” It helps control everything from scaling plans, what containers to run, permissions, and health checks.

The other plane, the data plane, is everything else. The hyperscalers describe this as the service’s core functionality. Since k8s’ functionality revolves around running containers, you could argue that it’s about each individual container and the isolation of those containers within k8s.

As you can see from the threat matrix, attacks along MITRE ATT&CK operate in both planes.

After giving this introduction, she jumps into several attack scenarios. But the scenario section first describes the k8s attack surface, and this is my favorite part of the blog. Obeng outlines four major areas you’ll see in any k8s attack: pod weaknesses, identity and access mechanisms, cluster configuration, and control plane entry points. Notice these are focused on the control plane as the end goal. So, if you can compromise any part of the data plane, for the most part, the main goal is to attack the control plane afterward.

She ends the blog with close to 10 attack scenarios, detection rules using Falco, and a follow-up with her lab for folks who want more hands-on learning.


🔬 State of the Art

EDR Evasion with Lesser-Known Languages & macOS APIs by Olivia Gallucci

~ Note, Olivia is my colleague at Datadog ~

EDR blogs from independent researchers are hard to find. It’s not that the blogs are tucked away in dark corners of the Internet; instead, EDR researchers who don’t work at vendors are few and far between. So, anytime I get to see research that goes deep into the EDR space, I pay close attention.

This is especially true for the macOS world. Microsoft has years of security solutions and a litany of researchers who document all kinds of peculiar malware and EDR behavior. This is logical, since most major security incidents over the last 30 years have been on Windows platforms. But in the last few years, attackers have shifted their focus to macOS. The opaqueness-by-design of EDR vendors AND Apple makes it hard to learn about security internals on this platform.

This technical analysis by Olivia helps break down those barriers by first describing the ecosystem of opaqueness of macOS combined with security vendor technologies. From my understanding (and after lots of stupid questions from me to Olivia), EDR vendors rely on Apple’s Endpoint Security (ES) framework, which is somewhat equivalent to Linux’s eBPF observability and security framework. Security vendors subscribe to security events, build detections over them, and implement EDR response features, such as blocking a piece of malware from executing.

This has its limitations, and Olivia’s analysis under her “Technical Analysis” section points them out. It’s reminiscent of the early days of Microsoft security, when bypasses emerged from malware families and it took a lot of effort for vendors and Microsoft to respond to them. The closed ecosystem has its advantages from a security controls perspective, but IMHO, it starts to do a disservice to organizations when attackers move faster than the controls you try to implement.


The Cloud-Native Detection Engineering Handbook by Ved K

This post is an excellent follow-up to Obeng’s blog, which is under the Gem at the top of the newsletter!

Detection engineering is much more than building detection rules. There are elements of software engineering, data analysis, and threat research that separate a good detection engineer from a great one. I’ve talked about this across my publication, podcasts and conference talks. But if you want a deep dive on how to wear these hats and implement these skillsets, Ved’s blog is a great resource.

Ved defines cloud-native detections as any research, engineering and implementation of a detection rule to identify threat activity in cloud environments (AWS, Azure, GCP) and Kubernetes. He then describes his nine-phase (!) approach to writing detections, and opens each subsection with what “hat” you should be wearing.

The value of this post lies in the diligence put into each phase, especially the use of real-world examples. The sections are bite-sized, so I wasn’t phased (ha!) by their number. It serves more as a handbook for you to reference as you move through the detection lifecycle.

My favorite section is under Phase 4, titled “Enrichment and Context.” It ties nicely with my piece about context and complexity within rules, and according to Ved, it does require a Software Engineering Hat. Ved lists out five critical pieces of context to help increase the efficacy of rules:

  • Identity Context: who is this (human) or what is this (service-account).

  • Threat Intelligence: what IP addresses, domains, or general knowledge around indicators of compromise do we have to help make decisions on this activity?

  • Resource and asset metadata: What critical asset inventories, compliance tags or posture related information exists to help identify the riskiness of this asset being attacked?

  • Behavioral baselines: is this normal behavior for this type of activity? Think Administrator activity at 2am on Saturday.

  • Temporal context: Attacks aren’t point-in-time, they are over a period-of-time. Can you enrich this alert with other context of events before it occurred?
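A rough Python sketch of what an enrichment phase like this can look like in practice, covering a subset of the five context types. The lookup tables stand in for real identity stores, TI feeds, and asset inventories; none of this is Ved's code:

```python
# Bolt identity, threat-intel, asset, and baseline context onto a raw alert
# before it reaches an analyst. All lookup data below is hypothetical.
IDENTITY = {"svc-backup": {"type": "service-account", "owner": "infra-team"}}
THREAT_INTEL = {"203.0.113.7": {"verdict": "known-bad", "source": "feed-x"}}
ASSETS = {"db-prod-01": {"criticality": "high", "compliance": ["PCI"]}}

def enrich(alert: dict) -> dict:
    alert["identity_context"] = IDENTITY.get(alert["principal"], {"type": "unknown"})
    alert["threat_intel"] = THREAT_INTEL.get(alert["src_ip"], {"verdict": "unknown"})
    alert["asset_metadata"] = ASSETS.get(alert["resource"], {"criticality": "unknown"})
    # Behavioral baseline: flag activity outside this principal's normal hours.
    alert["outside_baseline"] = not (9 <= alert["hour_utc"] <= 17)
    return alert

raw = {"principal": "svc-backup", "src_ip": "203.0.113.7",
       "resource": "db-prod-01", "hour_utc": 2}
enriched = enrich(raw)
print(enriched["threat_intel"]["verdict"], enriched["outside_baseline"])
```

Temporal context (events preceding the alert) usually needs a query back into the SIEM rather than a static lookup, so it is omitted here.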

Ved finishes the rest of the post, writes a detection, tests it, follows it through deployment, and sees how useful the alert is. It looks like this is his first post on his Substack, so I recommend subscribing!


How to defend an exploding AI attack surface when the attackers haven’t shown up (yet) by Joshua Saxe

This is a fantastic commentary on what happens when the security community knows that a new technology is going to bring all kinds of security issues, even though the issues haven’t materialized yet. Saxe’s framing revolves around the growing attack surfaces around AI technologies. It’s hard to parse marketing-speak and LinkedIn ads and messages from startup founders and salespeople claiming that “the bad guys are already using AI at scale to attack you!!11” without much proof. Perhaps they reference a news article about some basic usage of vibecoding malware, or a phishing site that has an HTML comment of “created by Claude Code.”

Saxe has recommendations around what security functions and specific teams can do to help prepare for this, and I will steal his framing around making controls and policies “dialable”. Security should aim to be enablers rather than disablers for our engineering and technology counterparts. So, build controls in security engineering, and implement detection & response processes, but configure them in a way so you can “dial up” the strictness as we see new attacks emerge from real scenarios rather than theoretical ones.


Introducing Pathfinding.cloud by Seth Art

~ Note, Seth is my colleague at Datadog ~

Seth recently released a comprehensive library of privilege escalation scenarios and techniques abusing IAM in AWS environments. There are 65 total paths, and 27 of them are not covered by existing OSS tools. The good news is that the website describes each attack and how to perform it, with a helpful graph visualization so you can see the traversal rather than try to create an image in your head.


📔 Field Manual

I wrote a Field Manual issue on Atomic Detection Rules over break! Please go check it out!


☣️ Threat Landscape

The Mac Malware of 2025 👾 by Patrick Wardle

This blog is a comprehensive look back at Mac malware incidents and research throughout 2025. Maybe I am showing my age, but if you had told me 10 years ago that macOS’s popularity was going to explode among cybercriminal groups, leading to large-scale compromises, I would have laughed at you. Wardle lists the top malware families, with associated incidents and blogs dissecting the malware, and walks through analysis of the malware using an open-source toolbox.


Researcher Wipes White Supremacist Dating Sites, Leaks Data on okstupid.lol by Waqas Ahmed

lmao


🌊 Trending Vulnerabilities

MongoDB Server Security Update, December 2025

I’m a bit late on this one due to holidays and time off, but MongoDB recently disclosed a critical vulnerability dubbed “MongoBleed” under CVE-2025-14847. It allows an unauthenticated attacker to connect to a MongoDB instance and leak memory contents, which can contain sensitive information: data stored in Mongo, authentication data, and cryptographic material.

I’m impressed with the transparency and diligence in the post. MongoDB found the vulnerability internally, validated it, built a patch, notified customers and rolled out a post. A researcher at Elastic published a PoC two days later (on Christmas, no less) that I’ll link below.


Ni8mare  -  Unauthenticated Remote Code Execution in n8n (CVE-2026-21858) by Dor Attias

n8n is an open-source workflow framework for building agent-to-agent systems. They recently disclosed two vulnerabilities, CVE-2026-21858 and CVE-2026-21877, scored 9.9 and 10.0, respectively. n8n itself has skyrocketed in popularity, primarily due to its ease of use for interfacing with agentic workflows and platforms. The 0.1 difference: 21858 is an arbitrary file read, which could allow reading secrets from a target system, while 21877 is full remote code execution.

I really enjoyed the technical detail of this post by Attias, which focuses on the arbitrary file read vulnerability. When you think of arbitrary file reads in a modern application stack like n8n, you can pull a lot more credentials that grant you access beyond dumping password files. Attias created a clever scenario: reading arbitrary session files and loading them into n8n’s knowledge base, allowing the extraction of the key from the chat interface itself.


🔗 Open Source

heilancoos/k8s-custom-detections

Kubernetes lab environment and corresponding detection rules from Obeng’s gem above.


appsecco/vulnerable-mcp-servers-lab

Hands-on lab for testing security vulnerability knowledge against MCP servers. There are nine scenarios, and each one looks pretty reasonable in its real-world applicability. You’ll need Claude and Python to run each one, and luckily, with MCP, you can specify the single Python file within the Claude config and get everything you need to get started.


Adversis/tailsnitch

Tailsnitch is a posture management tool for Tailscale configurations. You give it a Tailscale API key, and it’ll connect to your tenant’s API and compare its configuration to secure baselines.


joe-desimone/mongobleed

Original PoC of CVE-2025-14847, a.k.a MongoBleed, dropped right on Christmas :|. Has a docker-compose file so you can safely test it yourself.


kpolley/easy-agents

This is a nice example of what I think will be a normal detection and response engineer’s setup in the next few years. Your org will operate a repository with agent setups for technology like Claude Code, and it’ll contain a standardized list of MCP servers to use and agent instructions. Making it extendable to tweak or add agents and MCP servers should be as easy as another prompt and some glue work for a custom MCP.


Received — 11 January 2026 Detection Engineering Weekly

What are Composite Detections?

7 January 2026 at 02:48

Atomic Detection rules are critical building blocks for a detection engineering function. They provide visibility into singular event or indicator-based threat activity within an environment. The rules are narrow in scope and generally lack context for the blue teamer’s environment and the threat actor performing the malicious action. For example, an atomic detection rule can inspect Administrator logon activity in a cloud environment and generate an alert whenever an Administrator logs in. This captures malicious admin compromises (high recall), but also triggers on every legitimate admin login (low precision), flooding analysts with false positives.

This tradeoff also works in the opposite direction on the precision-recall spectrum. A detection engineer can deploy an atomic rule that is so precise it becomes brittle. It may never generate an alert because the fields it tries to capture are so specific that they offer low operational value.
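To put rough numbers on the admin-login example above, here is the precision/recall arithmetic with hypothetical counts:

```python
# An atomic rule that alerts on EVERY admin login: perfect recall, terrible
# precision. The counts are invented for illustration.
malicious_admin_logins = 5    # true positives: the rule catches all of them
benign_admin_logins = 495     # false positives: every legitimate login alerts
missed_malicious = 0          # false negatives: nothing slips through

recall = malicious_admin_logins / (malicious_admin_logins + missed_malicious)
precision = malicious_admin_logins / (malicious_admin_logins + benign_admin_logins)
print(f"recall={recall:.2f}, precision={precision:.2f}")  # recall=1.00, precision=0.01
```

Ninety-nine out of every hundred alerts are noise, which is exactly the analyst-flooding scenario described above.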

The Detection Engineering Field Manual is a series dedicated to sharing knowledge and my experience building, operating and scaling a detection engineering organization at a F500 tech company. Please like and subscribe if you find this series useful!

The way to combat these tradeoffs is to increase the context around the attack itself. This means capturing more threat activity to group atomic detections together, as well as increasing the context of the environment to differentiate benign from malicious activity. Composite detections, also known as correlated or stateful detections, increase the context and, therefore, the complexity of writing and maintaining the rule.
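As a minimal sketch of what "stateful" means here, consider correlating two atomic signals for the same user inside a time window before alerting. The event shapes and the 10-minute window are assumptions for illustration:

```python
# Composite detection sketch: alert only when an admin login is followed by
# network-share discovery from the same user within WINDOW.
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)

def composite_alerts(events: list[dict]) -> list[str]:
    alerts, last_login = [], {}
    for ev in sorted(events, key=lambda e: e["ts"]):
        if ev["type"] == "admin_login":
            last_login[ev["user"]] = ev["ts"]          # remember state
        elif ev["type"] == "share_discovery":
            t0 = last_login.get(ev["user"])
            if t0 and ev["ts"] - t0 <= WINDOW:         # correlate in window
                alerts.append(f"{ev['user']}: login + discovery within window")
    return alerts

events = [
    {"type": "admin_login",     "user": "alice", "ts": datetime(2026, 1, 7, 2, 0)},
    {"type": "share_discovery", "user": "alice", "ts": datetime(2026, 1, 7, 2, 4)},
    {"type": "admin_login",     "user": "bob",   "ts": datetime(2026, 1, 7, 9, 0)},
]
print(composite_alerts(events))
```

A lone admin login (bob) stays quiet; only the correlated chain (alice) fires, which is precisely how composite rules buy back precision.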

This field manual post covers (ha!) the pros and cons of composite detection rules and begins to explore strategies to expand context around threat activity.

Detection Engineering Interview Questions:

  • What is MITRE ATT&CK?

  • What is a composite detection rule?

  • Explain a threat activity scenario where a composite detection rule helps reduce false positives.

  • How do composite rules increase operational complexity for a detection engineer?

MITRE ATT&CK

MITRE ATT&CK (pronounced “MY-ter AT-ack”) is the industry standard for modeling threat activity. According to their main website:

“MITRE ATT&CK® is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community.”

There is no modern detection engineering and incident response without MITRE ATT&CK. It serves as a lexicon for security engineers across red and blue teams to standardize on how a specific attack occurs and the telemetry it generates.

In the ATT&CK matrix, Tactics run along the X-axis and represent the stages an attacker traverses to achieve an objective, such as exfiltrating sensitive data, deploying ransomware, or causing a denial of service. Ransomware deployment is the end goal, but it requires a lot of steps to achieve that impact: getting access to a victim machine, laterally moving to a domain controller, collecting secrets and cracking administrator passwords, and finally finding a way to deploy the ransomware.

Techniques sit along the Y-axis under each Tactic. Techniques are the how: specific methods adversaries use within each tactic to achieve their objective. For example, Network Share Discovery under Discovery is used by attackers to find interesting files, folders and target machines connected to the current machine. They can leverage this to perform Collection of sensitive information and Lateral Movement to a higher-privileged victim machine.

The beauty of MITRE ATT&CK is that it directly contradicts the adage “attackers only need to be right once, defenders have to be right 100% of the time.” Each technique in the matrix has associated telemetry and detection opportunities, and some even list threat groups that leverage the documented techniques.

What does this have to do with Composite Detections?

In my last post on Atomic Detections, I talked about how Atomic Detection rules lack context. These rules can use threat intelligence, such as malicious IP addresses, to generate alerts, but those IP addresses can be rotated, making the rule very noisy. So you wouldn’t want to deploy that rule beyond the window in which the IP address remains malicious.

On a separate Atomic Detection rule, a detection engineer can write a rule to alert on Network Share Discovery. This is an obvious choice from my example before: the next logical step after Network Share Discovery is Lateral Movement. We want to detect that, right?

The problem here, again, becomes context. What if a legitimate process, such as a file search or data backup tool, performs Network Share Discovery? You generate an alert, block the activity, and you’ve just killed productivity or a critical business process for one of your users. Does this mean you need to painstakingly investigate every Network Share Discovery alert? You could, but you would burn out, and the operational costs would be too high.

This is where Composite Detections can help, and where MITRE ATT&CK enables context via chains of events. By correlating Network Share Discovery with subsequent Lateral Movement attempts, we filter out benign activity and surface actual threats.

Composite Detections Tell a Story

Let’s continue to challenge the adage “attackers only need to be right once, defenders have to be right 100% of the time.” We know that writing one Atomic Detection rule can be noisy. So what if you write two? What if you write these rules across every single path along MITRE ATT&CK, under every Tactic? You would have high recall, but terrible precision, and a flurry of alerts that can’t discern between benign and malicious activity.

Let’s look at an example from our previous post on Atomic Detection Rules:

In this scenario, the Atomic Detection rule fires on administrator login activity. We are only looking at the event and ignoring sourceIP, timestamp, and location. These can help tell the story, but the story stops on the singular event. You could write some additional enrichment to tell the story that:

  • The Admin is logging in from a risky location, let’s say outside the U.S. for the sake of example

  • The Admin is logging in past business hours

But these enrichment points can also be part of legitimate business activity. This is where context comes into play.

Let’s say you have two other rules that capture potential threat activity of an Administrator creating a second account and attaching an Administrator policy or profile to it. It’s riskier (it’s further along the ATT&CK chain), but it lacks context. But what if you combine the threat scenarios and create a story?

Here’s the story: an Administrator account gets compromised, and an attacker runs a script to log in to your AWS portal automatically. They are smart cookies who believe in another adage, “two is one, and one is none,” and create a second account to achieve Persistence in your environment. They then leverage their Administrator privileges to attach an Administrator policy. Smart: if you reset the original Administrator password, they still have a backdoor into your environment!

By combining the three scenarios via the following rule, in pseudocode:

if user contains 'admin'
AND CreateUser action is called
AND AttachUserPolicy is called and the Policy = 'Admin'
THEN alert

You’ve told your SIEM quite a compelling story to look out for, and it found it!

There are some key questions from the above rule, and they emerge from the other data I’ve omitted from my diagram:

  • What is a legitimate amount of time between logging on and calling CreateUser?

  • Is calling CreateUser then attaching an Administrator policy malicious?

  • Does this Admin typically CreateUser and attach policies?

These questions are what adds complexity and cost to writing and maintaining a ruleset. So, a detection engineer must weigh the cost of this complexity versus the cost of false positives from Atomic rules.

In this specific Composite rule, we used Windowing. Windowing is a technique in which we correlate activity inside a fixed time window and treat a Composite detection whose events all land within that window as likely threat activity. The rule assumes that if an Administrator account logs in, creates a secondary account, and attaches a privileged policy to it within the window, it must be malicious. This reduces false positives by:

  • Combining three Atomic rules into one rule

  • Creating a story where these three actions together mean something malicious is happening, or at least requires investigation

  • Assuming threat actors will try to do this quickly, as their access may be revoked within a few minutes

Stories increase complexity

I linked a chart in my previous post about the trade-off between context, operational cost and false-positive reduction.

In this Windowed Composite Detection Case, there are several costs that detection engineers incur:

  • Does my SIEM technology support Windowing?

  • Does the combination of these detection rules capture the threat activity that I want? For example, should I also have a separate atomic rule for CreateUser to catch persistence attempts that don’t fit the 5-minute window? Relying only on composite rules can lead to false negatives.

  • Does the window period give me the best value? If I increase it to 15 minutes, what costs do I incur on server usage, indexing and other infrastructure components?

I will say that Detection Engineers I’ve hired, worked with, and spoken with at other companies spend as much time researching cost trade-offs as they do performing pure security research. This is the Engineering component of threat detection, and to me, these types of problems are what make the field exciting. You are part security researcher, part engineer, and part data scientist!

Conclusion

Composite detections shift detection engineers’ focus to reduce false positives by creating stories of attack chains. MITRE ATT&CK is the de facto industry standard for documenting how an attacker progresses through a breach to achieve an objective. Detection engineers can use ATT&CK to build atomic and composite rules to capture threat activity.

Atomic rules lack context by design, but when combined with other atomic rules via composite detections, you can start building a story of an attack. This story is the context you want to decide on whether you should investigate an alert. This story also reduces false positives by capturing the logical progression an attacker may take in your environment, and reduces the likelihood of alerting on benign activity.

The complexity of creating and maintaining composite detections stems from technological capabilities, such as windowing, as well as the hidden costs of assumptions made by the detection engineer. For example, combining three distinct events into a composite detection may miss other alerting scenarios within those events, leading to a false negative.

In the next Field Manual post, we'll explore different alerting mechanisms for composite and atomic detections outside of windowing.


What are Atomic Detection Rules?

15 December 2025 at 15:55

In the last post, we discussed the tradeoffs in designing effective rules. Detection efficacy captures the needs of the consumer of your detection rules, because that persona may be more concerned with missing an alert (a false negative) or with having too many alerts that don’t matter (false positives).

Finding attacks is the core value proposition of what detection engineers do, and it’s what makes this field technically challenging. Although difficult, this work has an art and aesthetic that is hard to find anywhere else in security. This is because you aren’t solving a machine-to-machine problem, but a human-to-human problem, and the other human is unwilling to cooperate with you. To me, detection engineering and blue teaming, overall, are studies of behavior.

Detection Engineering Weekly is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

In this post, we’ll begin looking at how rules detect threat activity through atomic detections.

Detection Engineering Interview Questions:

  • What is the Pyramid of Pain?

  • What is an atomic detection rule?

  • Compare and contrast scenarios where an atomic detection rule can be effective or ineffective.

  • What is environmental context?

David Bianco’s Pyramid of Pain

Some attacks generate telemetry that is easy to identify as an attacker on your system or networks. Many attacks, however, require logic that depends on telemetry availability, environmental context, index windows of logs arriving at the SIEM, and understanding of attacker tradecraft or behavior.

Much as detection engineers must consider operational costs when writing rules, threat actors incur costs when carrying out attacks. This cost-versus-cost battle helps frame attack and defense: you want to impose enough cost on an attacker’s operations that they deem a tactic or technique not worth their time. This is where the “Pyramid of Pain” by David Bianco becomes a valuable exercise for security teams.

At its core, the Pyramid of Pain challenges defenders to focus on imposing as much pain on attackers as possible. As you traverse up the pyramid, the operational cost of your efforts increases, but so does the amount of pain you cause an attacker. Each layer of the Pyramid represents an operational complexity the threat actor must consider when staging an attack. Near the top, if you detect Tools executing in your environment, your detections are more robust because the order and context of the tool’s execution become irrelevant.

The best state sits at the very top, under “Tactics, Techniques and Procedures” (TTPs). This layer focuses on the behavioral aspect of an attack. If you detect the behavior of an attack, every layer below it becomes less relevant to your detection (for the most part), and the detection is robust enough to catch changes in Tools, Artifacts, Domains, IP addresses and hashes.

Imagine this: you write a rule that helps detect a known Command-and-control (C2) server you read from a blog post. You deploy that rule and it doesn’t find anything. Great, you aren’t compromised, and you’ll have great coverage for the future if there is a compromise.

Here’s the problem: threat actors are well aware that we find C2 servers, build rules, share them with the community and blog about them. A C2 server is typically either an IP address or a domain. Have you ever rented a droplet on DigitalOcean, or bought a domain from Namecheap? You can spend a few dollars to rent more droplets or buy new domains. This imposes minimal pain on the threat actor’s side, and defenders can no longer detect the new C2 server until it is discovered again.

Even worse, the IP address you wrote a rule for is later leased to a benign client, and your rule now alerts on benign traffic, causing pain to you and your team.

So, how effective is your detection rule now? Not too effective! This is because detecting on a singular value, such as an IP address or a domain, is an Atomic Detection. Atomic Detections are narrowly defined rules that detect activity at a point in time with little to no context. Let’s dive into them in the next section.

Atomic Detections Lack Context

Atomic Detections are tactical in nature. They may seem precise in practice, but because they lack context from the environment and incur little pain for attackers, they become brittle and prone to false positives. As soon as an attacker changes their infrastructure or flips one bit in a new build of their malware, which changes the cryptographic hash value, your rule diminishes in quality.

Atomic Detections aren’t limited to threat intelligence indicators; they exist for host and network activity, too. The point here is that ignoring context in an environment, such as rules that don’t evaluate time signatures, environmental baselines, or regular activity, makes atomic rules risky to deploy.

Let’s look at a basic alerting example with Amazon AWS Administrator login activity.

The rule is in purple and only alerts on log activity where the user field value is admin. The SIEM correctly identifies the user field containing admin three times. The 11AM alert is a true positive: the administrator credentials were compromised. The other two are false positives, indicating normal administrative work. To make things worse, the compromised login happened during normal business hours.

So how do you differentiate between the three alerts?

You differentiate them by spending incident response cycles investigating each one. Now imagine hundreds or thousands of these being generated. The atomic rule strategy doesn’t work because there is little to no context on the event.

The same thing can be said for IP-based C2 alerting.

In this example, the detection engineer wrote an atomic detection rule for a known C2 IP address. Perhaps they read a blog some time around December 10 and added it quickly to find exposure. Log 1 enters the SIEM; the rule checks the destination field and generates a true-positive alert.

Fantastic! Let’s keep the rule!

The C2 was removed by the leasing company that owns it on December 11 due to the blog post. On January 15, a content delivery network leases the same IP address, and network traffic logs flow through the SIEM, triggering an alert. Each subsequent network log is a false positive.

The context from both of the graphs above is under the UNUSED field in the purple box. Associated domains, timestamps and physical location are all useful fields to add into the atomic rule to increase robustness of the rule and remove false positives. It would make sense, then, to start including all of these in your detection rule. Detection engineers need to understand the relationship between detection context and cost.
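As a sketch of what that added context buys you, here is the C2 rule extended with a hypothetical domain check and an indicator-expiry window. The field names, the addresses, and the 30-day TTL are all assumptions for illustration:

```javascript
// Sketch: the same C2 indicator, but with added context so the rule
// expires stale intel and checks a second field. Field names, values,
// and the expiry window are hypothetical.
const intel = {
  ip: '203.0.113.66',
  domain: 'bad.example',
  lastSeenMalicious: Date.parse('2025-12-10T00:00:00Z'),
};
const INTEL_TTL_MS = 30 * 24 * 3600 * 1000; // treat the indicator as stale after 30 days

function contextualC2Rule(log) {
  const fresh = log.time - intel.lastSeenMalicious <= INTEL_TTL_MS;
  return fresh && log.destIP === intel.ip && log.destDomain === intel.domain;
}

// The December hit fires; January CDN traffic to the re-leased IP does not.
console.log(contextualC2Rule({ destIP: '203.0.113.66', destDomain: 'bad.example', time: Date.parse('2025-12-10T12:00:00Z') })); // true
console.log(contextualC2Rule({ destIP: '203.0.113.66', destDomain: 'cdn.example', time: Date.parse('2026-01-15T12:00:00Z') })); // false
```

Each added condition costs something to maintain (who refreshes the TTL? where does the domain field come from?), which is exactly the context-versus-cost relationship discussed next.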

Imposing cost on ourselves

As we progress up the Pyramid of Pain and add context to our ruleset, the cost increases. Cost can depend on time, resources, maintenance, or the technology needed to add context, such as threat intelligence. The following graph tries to explain this causal relationship:

At the bottom left, you could deploy a rule similar to the examples above. Because the operational cost of matching on a single value is low, the context is low. And because the context is low, the risk for false positives is high. As you add context (move to the right), the cost increases, but the false-positive rate decreases.

This is why not every rule can be perfectly accurate. There is a cost-benefit tradeoff, as well as information asymmetry from attacker behavior, that detection engineers must consider. The only way a rule can catch all threat activity is to alert on every piece of activity. That seems costly!

Conclusion

Atomic detection rules generally focus on low-context events or values. They can certainly help a blue team function, such as a SOC or a Detection & Response team, and they have a place in security operations. They risk generating many noisy alerts when the detection engineer fails to account for a threat actor’s behavioral patterns.

The Pyramid of Pain and imposing cost are industry-accepted concepts that help contextualize the competing objectives of blue teamers and threat actors. Writing rules to alert on the bottom parts of the pyramid, which primarily involve threat intelligence indicators (IP addresses, domains, hash values), imposes a greater cost on defenders than on threat actors. Defenders impose more pain on threat actors by climbing The Pyramid and writing rules that detect tools and TTPs.

For the next few parts of this series, I’ll explain the different ways detection engineers can write rules to capture threat actor behavior and the associated operational complexity.


DEW #140 - SVG Filter ClickJacking, Detection Engineering "Onboarding" and React2Shell spotlight

10 December 2025 at 14:03

Welcome to Issue #140 of Detection Engineering Weekly!

Every week, I read, watch and listen to all the Detection Engineering content so you can consume it all in 10 minutes. Subscribe and get a weekly digest of the latest and greatest in threat detection engineering!

✍️ Musings from the life of Zack:

  • I’m in Paris this week after a quick personal trip to London. None of you told me that there are more people walking around in the West End than Manhattan!

  • I managed to get some great BJJ training in while in London, and tried cold plunging for the first time ever. Low key it’s amazing

  • This issue is vulnerability writeup forward. But, I’m happy for it, because I think people in blue team roles need to see and understand the inner workings of malicious, unintended code paths. IMHO it makes me a better security engineer

Primary Sponsor: Permiso Security

ITDR Playbook: Detect & Respond to Suspicious Authentication Patterns

Credential compromise now drives more than half of today’s breaches—and most teams still miss early warning signs. This Identity Threat Detection & Response Playbook breaks down the highest-value authentication anomalies and provides actionable detection and response steps your team can implement immediately. Strengthen identity defense where it matters most.

Download the Playbook


💎 Detection Engineering Gem 💎

SVG Filters - Clickjacking 2.0 by lyra

I wrote a blog about abusing Open Graph previews 7 years ago for phishing. The idea was that you could abuse how browsers render preview links to display one thing while redirecting to another. I’ve always tried to find a term or phrase to coin this style of attack. It’s not malware or phishing, but similar to IDN homograph attacks, it provides a confusing user experience for the victim. And within that confusing experience, you can socially engineer them to click into whatever malicious URL you want.

ClickFix became a huge hit for threat actors between last year and this year, and it abused this same concept. You are presented with instructions to copy and paste something into your terminal to download some piece of software or fix a bug. But by abusing how clipboard interactions work with a website, the user thinks they are copying and pasting a benign command, and they instead paste a malicious payload.

Lyra’s blog follows the same confusing user experience style, but this time, doing some fun things with SVG rendering. They got their original idea after Apple announced the Liquid Glass redesign, and wanted to recreate some of that experience in the web browser. After tinkering with some of the SVG Filter Effect primitives, they tried applying these effects over an iFrame, and whoops! It worked.

The reason this was so interesting to me is that my liquid glass effect uses the feColorMatrix and feDisplacementMap SVG filters - changing the colors of pixels, and moving them, respectively. And I could do that on a cross-origin document? - Lyra

The first demonstration was a PoC layering these types of effects over an iframe containing a sensitive one-time password code. As the attacker, you load the OTP frame inside an iframe, then trick the user into pasting the code back into what they think is the legitimate site, but is actually an SVG element layered on top. They dubbed this style of attack ClickJacking.

This isn’t even the most interesting part; it gets better! These <fe*> elements have some mathematical capabilities to help compute everything from masks to filters. Due to the nature of this attack, most of the logic has to occur inside the <fe*> elements, because you cannot extract pixel data from an SVG filter back into JavaScript or the DOM. So how do you create a multi-stage attack?

Well, why not make these elements functionally (not Turing) complete and create a limited-but-effective state machine inside the filters? That’s obvious, right, Zack? ←Lyra, probably, as they did this

Lyra made a logic-gate example to demonstrate this, and by applying a multi-stage filter mask to a victim iframe, they showed how to perform this SVG ClickJacking attack within a state machine rendered solely from these <fe> elements. The original post includes an ASCII-art walkthrough of the QR code attack with exfiltration.

The cross-origin part worries me the most here, because they essentially figured out how to overlay and extract data from the attack without breaking CORS.

They demonstrated this attack against Google Docs and were awarded a good sum of money for doing so. Video here:

https://infosec.exchange/@rebane2001/115265287713185877

I don’t know how you’d detect this in the browser, but you could have some exfiltration-style detections to work with once the data leaves the machine. UX Confusion strikes again!


🔬 State of the Art

Why the MITRE ATT&CK Framework Actually Works by John Vester

I read a lot of blog posts introducing MITRE ATT&CK to readers. I think it’s a great first topic for folks getting into the industry, because ATT&CK is such a staple for us. My biggest feedback on these blog posts is that they aren’t really offering anything new for readers. This isn’t a bad thing, since the content shouldn’t change too much, but Vester’s blog stands apart from the others I have read.

The blog starts with the typical introductory content on MITRE ATT&CK, but in the “Real-world ATT&CK” section, Vester begins describing ATT&CK as a practitioner who has been doing this for years. They do this by showing how ATT&CK looks when overlaid with detection rules inside Sumo Logic.

I appreciate this approach because it feels like Vester is a senior engineer, you are onboarding to a new company, and they are giving you the experienced perspective on the whole system. ATT&CK has plenty of faults, and much of the criticism it receives points at its real-world applicability. Luckily, Vester shows where it works really well and where it doesn’t. This type of balance is what makes ATT&CK useful; it’s a tool rather than a full-fledged solution.


Understanding the Nuances of Detection by Danny Zendejas

Maybe I’m stuck on this idea of reading blogs as if I’m onboarding to a new company, but Zendejas’ blog about Detection Nuances here is a great follow-up to Vester’s above.

We spend a lot of time jumping straight into rules and ATT&CK, but taking time to understand the logistics of detection engineering matters just as much. For example, Zendejas laid out the general architecture for SIEM, and then introduced readers to the types of formats and standards dedicated to search languages and rules.

Understanding and navigating these formats effectively is a fundamental part of a Detection Engineer’s role. Being data agnostic should be the goal. - Zendejas

The rest of the blog contains some good content around alert precision and alerting. If you put on a proverbial “onboarding at a new job” hat, this is a great introduction for folks entering the field or seeking a fresh look at fundamental concepts.


Threat Hunting based on Tor Exit Nodes (+ KQLs queries) by Sergio Albea

The Onion Routing (Tor) network is one of those funny cases of intention versus use. The idea behind it is ethically amazing: it helps mask the source of a connection to a destination server, and it would be particularly useful for people like political dissidents in hostile countries. But, whenever there is anything good, criminals tend to follow and exploit the goodness. Except crypto, all criminals! Just kidding.

In this post, Albea provides some excellent hypotheses and use cases for threat hunters to find machines on a network connecting to the Tor network. The first case is around the use of Tor locally to connect to Tor domains. This, in my opinion, is benign behavior for the most part, but it can raise legal and ethical concerns for a company, so your acceptable use policies should address it.

The second case is rooted in a more likely intrusion scenario. Attackers have used Tor to mask their source IP addresses and credential stuff login endpoints to prevent attribution and likely legal action. Although this makes sense from a privacy perspective, it’s terrible OPSEC in other ways. By design, the Tor Network publishes its exit node IP address list because, without it, Tor clients won’t know how to route through it. So, that makes an excellent detection mechanism to find abusive sign-in attempts from those routing their malicious traffic through Tor.

They provide several KQL examples so you can follow along with their hunting queries.
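Albea’s actual queries are in KQL, but the core of the second hunting case is a simple membership check against the published exit-node list. A sketch, using placeholder documentation IPs rather than real exit nodes, and a static set standing in for the list you would periodically refresh from Tor’s published exit-node endpoint:

```javascript
// Sketch of the hunting idea: flag sign-ins whose source IP appears in
// the published Tor exit-node list. In practice, refresh this set from
// Tor's published exit list; these addresses are documentation
// placeholders, not real exit nodes.
const torExitNodes = new Set(['198.51.100.23', '203.0.113.99']);

function signInsViaTor(signIns) {
  return signIns.filter(s => torExitNodes.has(s.sourceIP));
}

const signIns = [
  { user: 'alice', sourceIP: '198.51.100.23' }, // via a Tor exit node
  { user: 'bob',   sourceIP: '192.0.2.10' },    // direct
];
console.log(signInsViaTor(signIns).map(s => s.user)); // [ 'alice' ]
```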


How Amazon uses AI agents to anticipate and counter cyber threats by Daniel Weiss

This research piece from Amazon showcases their Autonomous Threat Analyst (ATA) environment. If you take AI out of the equation, it’s a neat setup that I haven’t really seen in other corporate environments. They created a separate rule-testing environment that mimics their production environment, which is a feat in itself.

Now to add the AI parts back: they have a multi-agent architecture where a blue-team agent creates rules, validates rule logic by querying their mimicked environment, and performs curation and deployment. The fun part here is their red-team agent. They ran a query to generate Python reverse shells for detection validation, and it generated over 30. They fed telemetry from these reverse shells into the mimicked environment and identified detection gaps to improve their ruleset.

The beauty of LLMs for detection isn’t really about accuracy, but more about scale. What I worry about with this type of scale is the false sense of comfort it can provide. Over thirty types of reverse shells seem like a great dataset, but was each one validated by an expert? Will LLMs generate obscure and distracting payloads to complete their task? If we only care about coverage at scale, will these LLMs waste time on these things instead of what we see in the environment?

These are all questions for which I don’t have a good answer. But, it may not matter in the sense that if we keep driving token costs down, then scale becomes irrelevant, even if the types of attacks are obscure.


Secondary Sponsor: runZero

Join runZero’s Holiday Hackstravaganza!


Tune into runZero Hour, a monthly webcast examining new exposures & attack surface anomalies. Join us on Dec 17 for 2025’s wildest vulns, top research picks, & 2026 predictions. Plus, trivia and Hak5 gift cards!

Register Now


☣️ Threat Landscape

⚡ Emerging Threats Spotlight: React2Shell

So the big threat landscape news in the last week was the React2Shell vulnerability. The exploit is elegant and simple, but the way the exploit chain leverages React’s processing capabilities is quite complex. Whenever 10/10 CVSS CVEs like this come out, the immediate thought is oh shit, another Log4Shell. It’s even worse when the researchers name the vulnerability something similar to Log4Shell, and this was no exception.

For those unfamiliar with React, it’s one of the biggest open-source frontend frameworks for arguably the most used programming language in the world, JavaScript. You can build highly responsive, complex, and beautiful applications and hook them into any backend framework of your choice.

The specific vulnerability is a server-side prototype pollution. Every object in JavaScript inherits the base prototype Object. So, when you build object primitives in JavaScript, everything from a User to a Window can use the Object’s properties. Here’s a basic example courtesy of Claude:

A person is an object with a single property, name. You can call person.toString() even though person doesn’t define a toString method. That’s because all objects in JavaScript inherit from Object by default: the lookup “calls” up the prototype chain until it reaches something that does define the method, such as toString!
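A small sketch in the same spirit as that example:

```javascript
// The prototype-chain lookup described above: person defines no
// toString of its own, yet the call succeeds via Object.prototype.
const person = { name: 'Ada' };

console.log(Object.prototype.hasOwnProperty.call(person, 'toString')); // false
console.log(person.toString()); // '[object Object]'

// The chain person -> Object.prototype -> null is what the lookup walks.
console.log(Object.getPrototypeOf(person) === Object.prototype); // true
```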

This is where things get interesting for React2Shell. If you can control the input to a JavaScript function in React, such that you can supply or override functions, you can achieve arbitrary code execution. This is the premise behind React2Shell.

My colleagues at Datadog wrote about this in an excellent post detailing the vulnerability details:

The payload uses prototype pollution to override then, and the actual malicious command lives under _prefix. It’s a shell execution command, so if a vulnerable React server processes this specific payload, the server will call out to a shell and write the output of id to /tmp/pwned.

React’s vulnerable codepath processes HTTP POST requests with the `Next-Action` header and attempts to deserialize the payload as a React Server Component action. During deserialization, React splits references like $1:__proto__:then on colons and traverses the property chain, inadvertently accessing Object.prototype when it hits __proto__ and boom, Object is polluted!
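To illustrate the bug class, here is a toy model of that traversal, not the actual React deserialization code: splitting a reference on colons and walking the property chain lands you on Object.prototype as soon as the path contains __proto__.

```javascript
// Toy sketch of the flaw described above: resolving a colon-separated
// reference by naively walking properties reaches Object.prototype
// through __proto__. Illustration only, not React's real code.
function resolveReference(root, ref) {
  // e.g. "$1:__proto__" -> walk root['__proto__'] (the "$1" id is
  // dropped in this sketch)
  const [, ...path] = ref.split(':');
  return path.reduce((obj, key) => obj[key], root);
}

// Walking "__proto__" lands on Object.prototype itself...
const target = resolveReference({ 1: {} }, '$1:__proto__');
console.log(target === Object.prototype); // true

// ...so an attacker-controlled assignment pollutes every object.
target.polluted = 'boom';
console.log({}.polluted); // 'boom'
delete Object.prototype.polluted; // clean up the toy pollution
```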

Why is this such a big deal?

React2Shell had the right ingredients to make it a serious vulnerability with an industry-wide response. These ingredients included a CVSS 10 score with potential remote code execution, a PoC, a website, a reference to a patch to reverse-engineer, and some hype on social media. Organizations rushed to find exposure and a patch, and some accidentally took down their global CDN network in the process. There were exploitation attempts in the wild (Greynoise has a great writeup on this). My $dayjob saw our environments get hit hard once more PoCs started to drop.

The hard part here, as Kevin Beaumont points out, is the environmental context when deploying this version of React Server Components with the Next.js router. A lot of prerequisites were required, not for the exploit itself, but for the stack that needed to be deployed, which had the vulnerable code path. And if you didn’t have any of these web servers exposed to the Internet, the urgency factor of patching diminished.

But was there as much impact as Log4Shell?

The answer is a resounding no, but with a big asterisk. Nothing compares to Log4Shell, as it truly was a black swan event in vulnerability land. But this is the problem with emerging news around vulnerabilities: we make comparisons to make sense of the chaos and use them to inform urgency. So although this turned out to be mostly fine from an impact point of view, I believe we placed the right amount of urgency on doing something.

It’s a net positive for an industry that has a reputation for crying wolf over the smallest things. It means we are getting smarter at identifying the prerequisites for a black swan event and being okay with it not happening, because we still protected ourselves.

Firm handshakes to all who responded within the last week!


🔗 Open Source

Bert-JanP/KustoHawk

PowerShell-based incident and triage platform for Azure environments. It uses the Microsoft Graph API to query for events related to Entra, Defender and Microsoft XDR. It ships with pre-baked queries so you can run investigations out of the box.


xorhex/BinYars

Binary Ninja plugin to run YARA-x rules inside a binja project. This is useful for reverse engineering workflows where you want to orient your understanding of the binary based on threat intelligence baked into YARA rules.


msanft/CVE-2025-55182

Fully contained PoC environment for React2Shell. The README also has a great explanation of the vulnerability and exploit chain.


qazbnm456/awesome-cve-poc

Yet another awesome-* list, but similar to the CVE-2025-55182 repository I linked above, contains references for all kinds of PoC code and environments for testing. I’ve found these most useful for when I need to capture telemetry and write rules in an environment that doesn’t mind getting exploited ;).


DEW #139 - Detection Surface, Frontier Models are good at SecOps & THREE YEAR ANNIVERSARY!

3 December 2025 at 14:03

Welcome to Issue #139 of Detection Engineering Weekly!

It’s crazy to think that it’s been three years of doing this newsletter.

Thank you all for making this a fantastic ride. Since I like stats and insights, here are some I pulled:

  • 15,000 subscribers as of Monday :)

  • 138 issues in total, so not a perfect 156-week streak, but roughly 20 weeks of downtime over three years sounds nice to me

  • Two kids, one major interstate move, one grad degree and no new tattoos, though I should commemorate this somehow and get a new one :)

  • At least one subscriber in all 50 states in the US. California, Texas, NY, Virginia and Florida are the top 5 most-subbed states

  • Subscribers from 153 countries across every continent. Substack doesn’t track Antarctica :(. US, India, UK, Canada & Australia are the top 5 most-subbed countries

  • If you like reading Ross Haleliuk, there’s a 30% chance you are also reading me. We have the top audience overlap! The newsletters from Eric Capuano, Jake Creps, Chris Hughes and Francis Odum are also fantastic, with high overlap

  • I started sponsored ad placements in September and have been booked every week since then, and 2026 is looking even crazier

This Week’s Sponsor: root

Why Detection Teams Need Minute-Level Remediation

When CVE-2025-65018 dropped last week (libpng heap buffer overflow, CVSS 7.1-9.8), the exposure window started ticking. Attackers armed with AI can weaponize CVEs within hours. Traditional remediation workflows take 2-4 weeks: triage meetings, engineering scramble, testing delays.

But here’s what detection engineers need to know: the exposure window is where attackers win. The Root team patched the critical CVE in 42 minutes across three Debian releases (Bullseye, Bookworm, Trixie), creating a fundamentally different detection posture than the same CVE unpatched for weeks. Detection strategies must account for minute-level remediation capabilities.

Learn what CVE-2025-65018 teaches us about matching attackers at AI speed and why week-level remediation cycles leave detection teams with massive blind spots.

Full Story



💎 Detection Engineering Gem 💎

Turning Visibility Into Defense: Connecting the Attack Surface to the Detection Surface by Jon Schipp

I’ve been shilling the term “Attack Surface” with the detection team here at work. I think it’s a reasonable mental model to use when you need to focus detection efforts on your inventory and telemetry sources. So, when I read this post by Schipp, I was pleased to see a similar framing of the Attack Surface problem :).

The security industry has a good idea of what an attack surface is. It even has a product category vertical dedicated to it, but the definition becomes vague when you differentiate between internal and external attack surfaces. According to Schipp, the definition should focus on the assets you need to protect, which, in general, I agree with. There is no rule without telemetry, and it’s nearly a full-time job for detection engineers to identify, track, and ship the right telemetry so we can write detections.

From Schipp’s blog

Schipp takes this a step further with the concept of “detection surface”. The adversarial behavior you want to detect can only be detected in a subset of the assets that you own. He lists a few reasons why:

  • Do you have the right technology selected to generate the right telemetry and alerts on top of the assets you own?

  • Are you prioritizing the correct detections to find adversarial behavior in the assets you find the most critical?

  • How do you find new gaps in coverage, and are you doing the exercise enough as your attack surface grows?

These questions are why the 100% MITRE coverage meme exists in our space. You may write rules that cover 100% of ATT&CK, but are they detecting the right behavior given your environment? I’d much rather look at a MITRE ATT&CK heatmap with deep coverage in two tactics, like Exfiltration and Lateral Movement, so I know the team is really focusing on specific behaviors to catch.

If you want to see a visceral physical reaction from me, throw a print-out of an all-green ATT&CK heatmap at me. I’ll probably run away screaming.


🔬 State of the Art

Evaluating AI Agents in Security Operations Part 1 and Part 2 by Eddie Conk

~ Note, I had Part 1 ready to go for this week’s issue and Conk & the cotool team posted Part 2. It’s important to read Part 1 so you can understand my analysis for their follow-up blog! ~

I loved reading this post because it shows how detection-as-code evolves beyond your ruleset into AI agents that handle everything from rule triage to investigations. Cotool researchers performed a benchmarking analysis of frontier models (GPT-5, Claude Sonnet & Gemini) against Splunk’s Botsv3 dataset. Botsv3 is a security dataset containing millions of logs from real-world attacks, along with a series of questions in a CTF-like format for analysts to practice investigations.

Benchmark exercises like this answer more than “are these models accurately performing security tasks?” LLMs are costly: they require financial capital to use the frontier model APIs, and human capital to shape, maintain, and verify results. AI agent efficacy is detection and investigation efficacy, so understanding ahead of time which agents perform well within the constraints of your business can accelerate decision-making.

Here are some of the results pasted from the blog:

The test harness for accuracy involved taking the individual CTF questions from Botsv3 and mapping them to investigative queries. Conk and team had to remove some bias from these questions because they were built as a progressive CTF. Basically, this means that answering one CTF question unlocked the next sequential question, and that sequential question could bias the investigation.

The latest frontier models from OpenAI and Anthropic outperformed Gemini here, but I was surprised to see 65% as a leading score.

Model investigative speed now enters the equation, and Anthropic’s Opus-4.5 beat the brakes off of every other model, including Haiku and Sonnet. This is good for teams who want to tune something to be fast and accurate, which seems like a good tradeoff, and it’s off to the races, right? Well, remember, detection efficacy means cost as much as it means accuracy, and the frontrunner, Opus-4.5, costs a little over $5 per investigation versus GPT-5.1’s $1.67.

There are a few other interesting callouts in the blog around token usage, but these three axes were the most relevant for people who need to balance accuracy, speed, and cost.

The detection community needs data like this to make cost-efficacy tradeoffs for their teams. Hopefully, we can see more studies comparing models, cost, and prompt strategies, and even better, releasing bootstrapping mechanisms to run these tests on our own.


OpenSourceMalware - Community Threat Database

This is a freely available threat intelligence database for reporting and tracking malicious open-source package malware. This is especially relevant for emerging threats, such as the Shai-Hulud attack, and it’s crazy to see how many packages are submitted nearly every day. If you sign in, you can view additional analysis details of the malware submitted by researchers.

Unfortunately, there are no direct IOCs on the page, so it’s hard to pivot to hashes if you want to download them from platforms like VirusTotal. It does link to sources like osv.dev, which sometimes contain hashes, but it’d be nice to see this platform host malware samples for download.


Revisiting the Idea of the “False Positive” by Joe Slowik

This oldie-but-goodie blog by Joe Slowik on the concept of false positives in security operations really drives home the underlying issues of the label. He first frames the idea of labels like true and false positives in terms of their origins in statistics. I wrote about these labels previously, and I tried to help readers understand that their value is directly proportional to the capacity of your security operations team.

Slowik goes in the other direction in terms of their value; instead of thinking about units of work, you should think about these labels in terms of the underlying behavior and hypothesis. Analysts talk about “true benigns” in this way. You alerted on the specific behavior you wanted to alert on, but you want to investigate further to determine whether it is malicious. This breaks the pure 1-shot application of a confusion matrix and adds more work for security analysts, since we need to question our underlying assumptions about a specific detection.

Recreated flow diagram from Slowik’s post
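Slowik's three-way framing can be sketched as a small labeling function. This is my own encoding of the idea, hypothetical and simplified, but it makes the distinction concrete: "false positive" should mean the detection logic missed its intended behavior, while a benign-but-intended match earns its own label.

```typescript
// Sketch (my encoding, not Slowik's): separate "the rule logic is wrong"
// from "the rule fired on exactly what it targets, but it was authorized".
type Verdict = "true_positive" | "true_benign" | "false_positive";

function labelAlert(matchedIntendedBehavior: boolean, wasMalicious: boolean): Verdict {
  if (!matchedIntendedBehavior) return "false_positive"; // detection logic missed
  return wasMalicious ? "true_positive" : "true_benign"; // hypothesis held either way
}

console.log(labelAlert(true, false));  // "true_benign": tune response, not the rule
console.log(labelAlert(false, false)); // "false_positive": fix the rule
```

The point of the extra label is that the two outcomes demand different work: a true benign sends you toward enrichment and response tuning, while a false positive sends you back to the rule itself.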

Challenging the hypothesis behind your detections aligns well with my discussion of security operations capacity versus efficacy. Here are a few questions I would ask you during this exercise:

  • Are you finding the right behaviors that could indicate maliciousness?

  • Are you okay with these behaviors generating true benign alerts, because the idea of a false negative with that behavior is detrimental?

  • Can the behavior you are looking for be enriched with environmental context, such as update cycles, peak traffic, or off-hours traffic?

The core of detection engineering is challenging assumptions. I hate the adage of “defenders have to be right every time, attackers have to be right once.” Finding a singular behavior to alert on across the attack chain gives us the advantage, so we really only need to be right once. So, as you build hypotheses and detection rules, you should balance what you want to see from a detection, even if it’s true benign behavior.


Intel to Detection Outcomes by Harrison Pomeroy

This is a nice introductory post to leveraging threat intelligence in detections.ai to generate detection outcomes. Full transparency: the platform has sponsored this newsletter, but it also has a community edition, so folks can sign up to benefit.

One of the hardest problems in cyber threat intelligence that I’ve dealt with for 15 years is proving tangible value. This is different than intangible value. The delivery of finished intelligence reports, RFIs, and investigative platform experiences can be considered intangible. You miss these things when you don’t have them, but it’s hard to measure the “why” behind the impact of a report or an RFI.

Detection engineering helps bridge this gap, specifically by enabling cyber threat intelligence teams to turn their research into tangible outcomes. This is what Pomeroy argues LLMs can do. You can feed an agent a cyber threat intelligence report, it can parse IOCs, TTPs, and log sources, and it can generate rules for you to try out and deploy to get up-to-date coverage of emerging threats.


Introducing LUMEN: Your EVTX Companion by Daniel Koifman

This is the release blogpost for Daniel Koifman’s LUMEN project, located at https://lumen.koifsec.me/. It’s a free tool for investigators and incident responders to load Windows evtx files for analysis. There are over 2,000 preloaded Sigma rules, and the entire analysis engine is run client-side. You can do several things once you load your logs in, such as running a sweep of the Sigma ruleset, building a dashboard on fired rules, building an attack timeline, and extracting IOCs. It has a feature to connect your favorite LLM platform to the tool using an API key and leveraging it for AI copilot capabilities.


☣️ Threat Landscape

Meet Rey, the Admin of ‘Scattered Lapsus$ Hunters’ by Brian Krebs

This is a classic Krebs doxing piece unveiling the identity of one of the main personas of The Com group, Scattered Lapsus$ Hunters. Rey was an administrator of one of the Com-aligned ransomware strains, ShinySp1d3r. It’s always crazy how he manages to pull the attribution thread to find these identities. An old message from Rey contained a joke screenshot of a scam email they received with a unique password. From there, he pivoted on the password to find more breach data tying Rey to a real person. Since Rey didn’t respond to him, Brian called his dad, and of course, Rey responded.


The Shai-Hulud 2.0 npm worm: analysis, and what you need to know by Christophe Tafani-Dereeper and Sebastian Obregoso

~ Note, I work at Datadog, and Christophe & Sebastian are my coworkers! ~

It’s rare to see the term worm in a headline these days, but the label still fits this unique security phenomenon, this time targeting npm (again). The Datadog Security Research team put a lot of time and energy into their analysis of the latest Shai-Hulud wave. Some interesting notes from this campaign include using previous victims to post new victim data, a wiper component, and a clever local GitHub Actions persistence mechanism.


Inside the GitHub Infrastructure Powering North Korea’s Contagious Interview npm Attacks by Kirill Boychenko

Boychenko and the Socket Research team published their latest work on TTP updates to North Korea’s “Contagious Interview” campaign. It’s an impressive operation given the scale they attempt, aiming to conduct as many malicious interviews as possible. In this campaign, they tracked hundreds of malicious packages with over 31,000 downloads. The factory-style setup of rolling new GitHub users with the malicious interview code, fake LinkedIn profiles, and rotating C2 servers is classic Contagious Interview.


Unmasking a new DPRK Front Company DredSoftLabs by Mees van Wickeren

To continue on the DPRK train, I found this post fascinating because it wasn’t about the malware associated with WageMole/Contagious Interview, but rather the techniques behind tracking infrastructure. Van Wickeren leveraged the reliable GitHub search engine to find malicious repositories linked to the campaign.

I was a little confused by their use of WageMole, only from a pure clustering nerd perspective. These look like Contagious Interview repositories, and the associated OSINT screenshots that call out some of them suggest that victims were taking malicious coding tests. WageMole, on the other hand, is a fake IT worker applying to companies.

At the end of the day, it doesn’t matter too much because they all overlap, but it’s another demonstration of how hard attribution is in this field.


🔗 Open Source

Koifman/LUMEN

Full LUMEN web-app from Daniel Koifman’s blog in State of the Art above. You can host your own LUMEN instance without ever leaving your localhost!


Vyntral/god-eye

Subdomain and attack surface enumeration tool that leverages local Ollama for AI analysis on top. It connects to twenty different open-source scanning and directory services, like dnsdumpster, then pushes results into the local Ollama model. It looks intelligent enough to help with HTTP probing, CVE analysis, and sifting through JavaScript code for anything leaked or vulnerable to standard web attacks.


R3DRUN3/magnet

Magnet leverages the GitHub API and specific query strings to find potential secrets posted to public repositories. You can specify your own strings or use the ones magnet provides. In their PoC, R3DRUN3 managed to find two repositories with leaked tokens, then responsibly reached out to the owners with remediation steps, and they responded.


ChiefGyk3D/pfsense-siem-stack

SIEM-in-a-box for pfSense firewalls. It has an impressive architecture: an OpenSearch backend, Logstash parsers, and Grafana/InfluxDB for metrics. It looks like they’ll be extending the backend to other open-source SIEMs like Wazuh in the future.


RazviOverflow/advent-of-hacks

Awesome-* style list of hacking challenges for the holiday season. So far they have 8 listed, so if you wanted to spend some time this December to up your hacking and CTF knowledge you have your work cut out for you!


DEW #138 - Sigma's Detection Quality Pipeline, Anthropic finds AI-first APT & eBPF shenanigans

19 November 2025 at 14:03

Welcome to Issue #138 of Detection Engineering Weekly!

✍️ Musings from the life of Zack:

  • I switched to the Brave browser, and I don’t think I’m ever looking back

  • My coworker suggested I go to a Tottenham Hotspur match while I’m in London. I’m a fan of one of the most insane fanbases in the NFL, where we jump through folding tables set aflame before games, and I feel that same energy from the Spurs YouTube shorts I’m watching during my research

  • I fractured my rib 5 weeks ago and I’m finally back (carefully) training. It feels good to move again!

This Week’s Sponsor: Sublime Security

Tomorrow: Intro to MQL, Threat Hunting, and Detection in Sublime

We invite Detection Engineering Weekly subscribers to join a technical webinar that will guide you through how Sublime Security detects advanced email threats. Learn how MQL (Sublime’s native detection language), threat-hunting workflows, Lists, Rules, Actions, and Automations all contribute to a flexible detection pipeline.

Additionally, discover how our Autonomous Security Analyst (ASA) accelerates investigations.

Register today!



💎 Detection Engineering Gem 💎

SigmaHQ Quality Assurance Pipeline by Nasreddine Bencherchali

Many people claim to use detection-as-code, but I rarely see these pipelines discussed as transparently as those from SigmaHQ. In this post, Nasreddine provides readers with a complete overview of how Sigma’s community ruleset repository manages community contributions. Documentation is essential here: the Sigma team ensures that every community rule adheres to a specification, so they all appear the same, even down to the filename. Here’s their Linux rule specification:

I love the attention to detail here. When you have a ruleset of thousands of rules, you need consistency in every step of the detection engineering process. These conventions may not matter when you are a single team managing dozens of rules, but when you are a five-person team managing thousands, they make the ruleset more attractive for others to use and also keep you sane.

The coolest part here, IMHO, is the combination of benign and malicious log validation tests. Each rule in each pull request undergoes several validators, followed by a good-log test and regression testing. The good-log test takes candidate rules and runs them across the evtx-baseline repository. If a rule generates an alert, then it must be a false positive, and the pipeline fails.

Separately, the regression testing pipeline ensures that a change in the rules doesn’t introduce any regressions that could cause false negatives, and it forces submitters to contribute a sample malicious log to validate the rule’s usefulness. The maintainers may also request reference links to blogs, threat intelligence websites such as VirusTotal, and even malware sandboxes to ensure they understand the efficacy of the rule before merging.
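The good-log gate boils down to a simple invariant: a candidate rule that fires on a known-benign baseline has, by definition, produced a false positive. Here is a hedged sketch of that gate. The real pipeline runs full Sigma rules against the evtx-baseline repository; matching here is reduced to exact field equality, and the rule and event shapes are my own invention:

```typescript
// Simplified stand-ins for a Sigma rule and a Windows event.
type Rule = { title: string; selection: Record<string, string> };
type LogEvent = Record<string, string>;

// Toy matcher: every selection field must equal the event field exactly.
const matches = (rule: Rule, event: LogEvent): boolean =>
  Object.entries(rule.selection).every(([field, value]) => event[field] === value);

// The gate: any candidate rule that alerts on the benign baseline fails CI.
function goodLogGate(candidates: Rule[], benignBaseline: LogEvent[]): string[] {
  return candidates
    .filter((rule) => benignBaseline.some((event) => matches(rule, event)))
    .map((rule) => rule.title);
}

const failures = goodLogGate(
  [{ title: "Suspicious certutil Download", selection: { Image: "C:\\Windows\\System32\\certutil.exe" } }],
  [{ Image: "C:\\Windows\\System32\\svchost.exe" }],
);
console.log(failures.length === 0 ? "gate passed" : `gate failed: ${failures}`); // "gate passed"
```

The regression half of the pipeline is the mirror image: rerun every merged rule against its contributed malicious sample and fail if any rule stops firing.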


🔬 State of the Art

Stopping kill signals against your eBPF programs by Neil Naveen

This post is an excellent study in the cat-and-mouse game of threat detection on Linux systems. eBPF-style security agents are, for the most part, the de facto standard for telemetry inspectability and detection & response on Linux. We’ve seen a lot of research in this newsletter on how threat actors on Windows spend time trying to disable EDRs to go unnoticed during their operations, but I had seen little, if any, research on protecting eBPF agents on Linux until I read Naveen’s work here.

When you want to terminate an eBPF agent, you need root privileges to do so, as these agents run as Linux daemons. An attacker who gains those privileges can send a kill signal to the process, and Bob’s your uncle. But what if you wanted to add extra steps, collect even more telemetry, and catch the compromise? Naveen came up with two options:

  • Using eBPF to hook kill and never let anything kill it

  • Leveraging cryptographically signed nonces as an added layer of assurance before accepting a kill signal, which also keeps your sanity, since option one locks you out of restarting your own agent

I’ve been doing Linux development, both offensively and defensively, for over a decade. This is probably the first time I’ve seen a clever application of cryptography to give a defense-in-depth approach to Linux detection & response. Here’s Naveen’s workflow comparing and contrasting a standard public-private key setup to a nonce-based signature kill methodology:

Example signature flow from Naveen’s post

Of course, actors can also do fun stuff like attacking the network stack directly and preventing the agent from reaching your security vendor’s domain for additional alerting.
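The nonce idea can be sketched with an HMAC. This is an assumed shape, not Naveen's actual implementation: the agent hands out a fresh nonce, and only honors a kill request whose HMAC over that nonce verifies under a key the legitimate operator holds. Replay fails because each nonce is consumed on first use.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

const operatorKey = randomBytes(32); // shared out-of-band with the operator
let pendingNonce: Buffer | null = null;

// Agent issues a one-time nonce for the next kill attempt.
function issueKillNonce(): Buffer {
  pendingNonce = randomBytes(16);
  return pendingNonce;
}

// Kill is honored only if the caller can HMAC the outstanding nonce.
function authorizeKill(signature: Buffer): boolean {
  if (!pendingNonce) return false; // no outstanding nonce: reject
  const expected = createHmac("sha256", operatorKey).update(pendingNonce).digest();
  pendingNonce = null; // consume: one nonce authorizes at most one kill
  return signature.length === expected.length && timingSafeEqual(signature, expected);
}

const nonce = issueKillNonce();
const sig = createHmac("sha256", operatorKey).update(nonce).digest();
console.log(authorizeKill(sig)); // true: signed kill accepted
console.log(authorizeKill(sig)); // false: replay rejected, nonce already consumed
```

A root-level attacker who can only send `SIGKILL` to the protected process never presents a valid signature, so the kill hook drops the signal and the agent gets to phone home about the attempt.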


Technique Research Reports: Capturing and Sharing Threat Research by Andrew VanVleet

This post serves as a follow-up to VanVleet’s research into detection data models (DDMs). DDMs are a form of documentation for detection engineers to help transcribe knowledge from an attack technique into actionable detection opportunities. But, there’s always more to a detection rule than the specific telemetry it’s trying to capture. This is where VanVleet introduces Technique Research Reports (TRRs).

The idea behind these reports is to capture the research knowledge surrounding the technique and rule. This is probably the most challenging part of our jobs: individual research methodologies vary, and you may be an expert in a specific attack surface or style of attack, but it doesn’t do your team any favors if you can’t help them learn how you arrived at a rule. It’s even worse if you leave the team and folks are left trying to understand the specifics of the attack, as well as the environmental context and the research you performed.

I do see a lot of similarity with MITRE ATT&CK’s recent v18 launch, specifically Detection Strategies. “Identify possible telemetry” is, in general, where Detection Strategies stop and TRRs begin. Log sources are environment-specific: although you may have Sysmon, EDR, or syslog logs, they become nuanced based on your environment setup. For example, writing for CrowdStrike versus SentinelOne changes your log source query.

They are incredibly comprehensive write-ups, or “lossless” research reports, as VanVleet calls them. For example, the TRR for DCShadow attacks is a fantastic resource for detection engineers to understand the intricacies of a Rogue DC attack. It can be a blog post in its own right. However, this is where the tradeoff between documentation quality and the velocity of maintaining a ruleset comes into play.

I love this research, but given how much valuable time he invested in it, it may not be conducive to productivity unless your leadership allows you the time. I also worry about drift in techniques and telemetry sources, which can make some of these reports outdated. LLMs could help solve some of this, because they are generally very good at parsing and maintaining knowledge bases.


Weird Is Wonderful by Matthew Stevens

This is a short-but-sweet commentary on the role of detection engineers and how we need to “catch the weird.” It’s always nice for me to see fresh takes on concepts I’ve talked and read about for years. When folks try to break into this industry, they are sometimes bombarded with extremely technical concepts, complex environments, and a wide array of technologies they must learn before they feel useful. But, sometimes, it’s nice to hear from others who can distill complicated subjects into easy-to-understand concepts.

Catching weird, to me, is the idea that we all succeed at our jobs when we can distinguish normal from malicious. Weird may not be malicious, so having some intuition around things that look off can help solidify the baseline of normal in your environment versus something not normal. It’s a professional paranoia, of sorts :).


Be KVM, Do Fraud by Grumpy Goose Labs / wav3

This is a follow-up post to Grumpy Goose Labs’ research on hunting for KVM switches to detect fraudulent employees. It’s full of Kim Jong-un memes, but there are excellent technical details around detecting KVM switches in your environment. The author, wav3, uses CrowdStrike as their example and dumps a bunch of information on hunting for indicators ranging from KVM hardware to display settings and product strings, so you can see who among your workforce may be using these risky devices.


☣️ Threat Landscape

⚡ Emerging Threats Spotlight: Anthropic Disrupts First AI-Orchestrated Cyber Espionage Campaign

Disrupting the first reported AI-orchestrated cyber espionage campaign by Anthropic

Last week, the threat intelligence team at Anthropic disclosed the disruption of the “first-ever” AI-orchestrated espionage campaign. GTG-1002 is the designation for this threat cluster, which they attributed with high confidence to a Chinese state-sponsored operation. In this summary, I’ll break down the architecture and Anthropic’s analysis of the attack workflow, share my commentary on the parts of the report I like and dislike, offer my medium-high confidence analysis of details missing from the report, and provide takeaways for detection engineers.

Attack Architecture

The most interesting aspect of this operation is that Anthropic had visibility into the orchestration layer of the threat activity, which leveraged a combination of Claude and several MCP servers. They claim the threat group ran 80-90% of its operations autonomously, an impressive feat when you consider that this is a nation-state operation. GTG-1002 managed to jailbreak Claude into thinking it was talking to a red teamer, allowing them to instruct Claude to work on their behalf.

If you had told me last year that a nation-state would trust an AI system to execute its campaigns against victims, I would have (rudely) laughed in your face. But it looks pretty slick:

Architecture diagram pulled from the Anthropic report.

For those unfamiliar with the Model Context Protocol (MCP), it provides a standardized way to connect a human interface, such as a chat or code editor, to external tools like APIs. AI applications like Claude ship with only a small set of built-in tools, so writing your own connectors to centralize your chat interface around whatever toolset you want is a powerful feature of these platforms.

According to Anthropic, GTG-1002 built a suite of MCP servers that connected to several open-source toolsets dedicated to performing reconnaissance and fingerprinting, exploitation, post-compromise lateral movement and discovery, and eventually, collection and exfiltration. This is the impressive part of the operation: imagine an operator leveraging a chat interface to create a scalable infrastructure for red team operations, with the “backend” attack tool system handled by Claude and capable of scaling as needed.

The team claims that, with their visibility into Claude usage, the operators automated 80% to 90% of their attacks. The remaining 10-20% involved human verification at the “Report & Analysis” step, as shown in the diagram above.

Attack Flow

Anthropic grouped the attack operations into five phases, as shown above. The “robot” in each phase represents the MCP server, directing specific tools to perform tasks along the ATT&CK killchain. The human icon next to the robots indicates a manual validation step: these pit stops verify that Claude is behaving correctly and not hallucinating.

In the report, the validation steps surfaced a myriad of hallucinations. Anthropic claims Claude returned incorrect results, non-existent credentials, and wrong IP addresses. So, although the attack flow diagram shows a clean, step-by-step process for each attack phase, these operations were frequently rerun.

Pros & Cons

This report has received criticism from the security community since its publication. To me, it’s a landmark report; famous or infamous, it has left its mark. I want to list both what I like and what I don’t like about it.

What I like:

  • There’s an excellent demonstration of the unique visibility the Anthropic team has over attack infrastructure. It’s certainly a threat intelligence source that we can derive useful insights from, and foundational model companies like Anthropic and OpenAI can provide that

  • There is a specific call out around responsible disclosure to victim organizations. It shows the good intentions of the security team at Anthropic, and I hope to see more of that in the future

  • They admit shortcomings around how the actors performed jailbreaking to get Claude Code to help them with their operations, as well as limitations in hallucinations

  • The transparent technical context around the threat model of AI Trust was helpful to see and understand their day-to-day challenges

What I didn’t like:

  • They did not provide any indicators of compromise. No IPs, domains, hashes, signatures, or payload examples. It’s hard for research teams to verify findings independently.

  • The attribution is vague, and it reads like Anthropic intentionally redacted proof around this activity. Indicators of compromise could help with this

  • It reads as if these attacks were cloud-based instead of on-premise. I couldn’t parse whether this was differentiated, though it doesn’t change the severity of a Chinese-nexus APT cluster. The callout about attacks against databases, internal applications, and container registries makes me think this was a cloud environment

Overall, the report provides a net benefit to security teams on several fronts. The claim of an APT abusing modern AI architecture, coming from Anthropic rather than vendor marketing, is a step forward in our understanding of an evolving threat landscape. It builds trust in the security team at Anthropic, which runs one of the most used foundational model platforms today. If we got this report from another vendor, we’d question the efficacy of their security program.

I think the feedback is valid regarding the value of threat intelligence, but I only see them improving from here.


🔗 Open Source

tired-labs/techniques

Technique Research Report dataset from VanVleet’s work above. It has extensive documentation of several attack techniques, and they fit the style-guide he talked about in his blog. It also includes a link to a frontend searchable library for those who don’t want to navigate the GitHub repository.


ricardojoserf/SAMDump

A Volume Shadow Copy technique that leverages internal Windows APIs instead of the command line. When you run the binary, it won’t generate the traditional Sysmon telemetry tied to vssadmin.exe, which arguably makes it harder to detect. It has a few other tricks, including calling NT APIs directly and avoiding GetProcAddress.


reconurge/flowsint

Open-source and graph-based OSINT tool that looks like a more modern take on Maltego. It has dozens of transforms, so you can get a good amount of functionality out of it to compete with Maltego. The differentiation here would be hosting something on your own, and if you require specific integrations, you’d have to build them yourself.


RootUp/git-fsmonitor

This is a fun initial access technique leveraging the fsmonitor capability of git clients. You edit the git configuration file and set the fsmonitor value to a shell script. When git is run, the shell script executes under the hood.
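For defenders, that config override is easy to hunt for. Here’s a minimal sketch (my own illustration, not from the repo): walk a directory tree and flag repositories whose core.fsmonitor is set to something other than a boolean, since an arbitrary script path is the tell.

```python
import configparser
import os

def find_suspicious_fsmonitor(root):
    """Walk `root` and flag .git/config files whose core.fsmonitor is
    overridden with something other than a boolean. Treating any
    non-boolean value as suspicious is a simplifying assumption for
    this sketch; a path to an attacker-controlled script is the tell."""
    hits = []
    for dirpath, dirnames, _files in os.walk(root):
        if ".git" not in dirnames:
            continue
        cfg_path = os.path.join(dirpath, ".git", "config")
        cfg = configparser.ConfigParser(strict=False)
        try:
            cfg.read(cfg_path)
        except configparser.Error:
            continue  # unparsable config: skip (or flag, if paranoid)
        value = cfg.get("core", "fsmonitor", fallback=None)
        if value is not None and value.lower() not in ("true", "false"):
            hits.append((cfg_path, value))
    return hits
```

In an EDR-less environment, even a scheduled run of something like this over developer home directories would surface the technique.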


DEW #137 - AI Agents For Security By Security, Free Sigma training & JA4 for beginners

12 November 2025 at 14:28

Welcome to Issue #137 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week:

hey you got a light? Nah Bud Light
  • I was in LA for a wedding and went to Venice Beach for the first time. It was awesome seeing pros at the skatepark, jamskaters, live music, and of course, this ^^ MF DOOM mural

  • Speaking of LA, there are Waymos EVERYWHERE

  • It started snowing here in New England, and we celebrated by running outside barefoot for as long as my family could bear it

This Week’s Sponsor: Nebulock

Trust Your Intuition. Vibe Hunt for Outcomes.

Good hunters feel suspicious activity before the alert ever hits. Vibe Hunting allows you to lean into that intuition and combine it with machine reasoning to hunt across data and telemetry without juggling tools. Nebulock’s threat hunting agents connect the dots, explain reasoning, and deliver contextual recommendations.

Hunting becomes less about process and more about bridging hypotheses with detection.

Start Vibe Hunting



💎 Detection Engineering Gem 💎

How Google Does It: Building AI agents for cybersecurity and defense by Anton Chuvakin and Dominik Swierad

I typically avoid including vendor blogs that stay at a high level on complicated topics like security and AI. But this one strikes a great balance, describing how they won over internal Google security engineers who were skeptical of leveraging AI in their day-to-day work. I think this approach can be copied by any security organization looking to augment its security operations with LLMs, as it focuses on small, achievable wins grounded in risk reduction and reality versus “thinking big.”

Chuvakin and Swierad split this approach up into four steps:

  1. Hands-on learning builds trust: You wouldn’t want to purchase a SIEM without having your Detection & Response team understand how to use it, so why do the same thing with agentic systems?

  2. Prioritize real problems, not just possibilities: Ground your agentic problems in a space where you are already familiar with the problems. They list two prime examples every D&R engineer could use to help with: analyzing large swaths of security data into insights, and quickly triaging malicious code to understand its function

  3. Measure, evaluate, and iterate to scale successfully: This section uses the dirty word/acronym “KPI” (cringes in business school). Instead, they gut-check success by asking two critical questions: “Did this meaningfully reduce risk?” and “What amount of repetitive tasks did this automate and free up capacity?”

  4. Get your foundations right: This is the most nuanced section that carries the most value for folks to steal. When you develop agentic systems, stick to simplicity on the particular task you need the agent to do. Agents aren’t security engineers, they are containerized experts in a small subset of tasks. Ensure they are proficient in these tasks, because what makes them powerful is how you connect them together.

The way I see this working for years to come is that we’ll have agentic workflows handle the “80%” work, such as repetitive tasks or analysis. The “20%” work that requires a ton of focus will be traditional expert work that we know and love. This split still requires us to have deep expertise in our field, but I worry about the value of learning from the more boring or tedious work.


🔬 State of the Art

Detection Stream Sigma Training Playground by Kostas Tsialemis

Tsialemis, a long-time contributor to the detection engineering research space and a multi-time featured author on this newsletter, just published a free Sigma training playground for detection engineers. His associated blog post goes over the platform in detail, but it’s like a CTF for writing rules. Cool features include interactive challenges, responsive feedback, and the ability to write your own challenges and contribute them to the community.

A leaderboard always motivates me, too. #8 as of 10 November!


Mistrusted Advisor: Evading Detection with Public S3 Buckets and Potential Data Exfiltration in AWS by Jason Kao

Trusted Advisor is a free service from AWS that helps scan customer infrastructure for misconfigured security and resilience resources. One resource it helps find misconfigurations for is in S3 buckets, which have led to massive security incidents and breaches like those at Capital One and Twitch. So, if you can find a 0-day bypass to a security system like this, it can give an attacker the ability to evade defenses in your cloud accounts. And it appears that is what Kao and the Fog Security team did.

The basic premise behind this attack is setting an insecure policy that would normally generate an alert from Trusted Advisor, while explicitly denying the three actions Trusted Advisor uses for the check.

So the insecure policy statement spans lines 4-10, while the bypass occurs in a separate statement on lines 11-17. As it turns out, even AWS can get IAM wrong! Basically, the check failed open here and reported nothing was wrong, when the behavior should be to fail closed in cases where it can’t retrieve the telemetry to make an assessment.
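To make the shape of the bypass concrete, here’s an illustrative bucket policy built in Python. The Allow statement is the misconfiguration Trusted Advisor should flag; the Deny statement blinds the check by refusing the read-configuration calls it depends on. Note the specific denied actions below are my assumption for illustration; Kao’s post lists the exact three Trusted Advisor uses.

```python
import json

# Illustrative only: the denied actions are an assumption on my part,
# chosen to show the shape of the bypass, not copied from Kao's post.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # The insecure part: world-readable objects. This is what
            # Trusted Advisor's S3 check should flag.
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {   # The bypass: explicitly deny the read-configuration calls
            # the check itself performs, so it silently reports no issue.
            "Sid": "BlindTrustedAdvisor",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:GetBucketAcl",
                "s3:GetBucketPolicy",
                "s3:GetBucketPolicyStatus",
            ],
            "Resource": "arn:aws:s3:::example-bucket",
        },
    ],
}

print(json.dumps(bucket_policy, indent=2))
```

A blanket Deny like this also blinds everyone else to the bucket configuration, which is itself a detection opportunity: alert on policies that deny Get*Policy/Get*Acl actions to all principals.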

The team submitted the security disclosure to AWS, and AWS fixed it after two tries. It also looks like Fog Security wasn’t happy with how AWS publicly disclosed the issue, as the disclosure referenced a non-existent action, an inaccuracy the hyperscaler later fixed.


All you need to know about JA3 & JA4 Fingerprints (and how to collect them) by Gabriel Alves

This piece is an easy-to-understand introduction to the powerful TLS fingerprinting algorithms, JA3 & JA4. With TLS everywhere, the underlying Application Layer traffic has become much harder to analyze for potential security indicators. You could set up TLS termination, but there’s a large cost associated with building that infrastructure, and decrypting and inspecting traffic also leads to compliance issues.

The JA* algorithms solve this by building fingerprints of the unique characteristics of TLS handshakes. Virtually every implementation of TLS in code has its own quirks and intricacies that make it unique. When you add more infrastructure on top of that, it can be a powerful tool to cluster traffic in ways to identify malware families, hosting infrastructure or bots.
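The JA3 recipe itself is simple: concatenate five ClientHello fields in a fixed order and MD5 the result. A minimal sketch (the example field values are made up for illustration):

```python
import hashlib

def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """Build a JA3 hash: MD5 over the five comma-separated ClientHello
    fields (TLS version, cipher suites, extensions, elliptic curves,
    EC point formats), each field's values dash-joined."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return hashlib.md5(ja3_string.encode()).hexdigest()

# Example values (illustrative, not a real client's handshake)
fp = ja3_fingerprint(771, [4865, 4866, 4867], [0, 11, 10], [29, 23, 24], [0])
```

JA4 changes the recipe (sorted cipher and extension lists, truncated SHA-256, protocol and ALPN markers) precisely so fingerprints survive tricks like Chrome-style extension randomization.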

Alves provides readers with some great visuals to understand these unique fingerprints and utilizes the most powerful security tool in existence, Wireshark, to do so.


Agentic Detection Creation: From Sigma to Splunk Rules (or any platform) by Burak Karaduman

I’m seeing more blog posts leveraging agentic workflow platforms to build detection content, and I’m all for it. At this point in our journey in detection engineering, I don’t see why you wouldn’t have agentic rule writing to assist you. Here’s why:

  • MITRE ATT&CK serves as a rich knowledge base of tradecraft references that we all fundamentally agree is the standard

  • Telemetry sources are well documented, and the startup cost of booting up an environment for testing is decreasing more and more

  • Threat intelligence companies and blogs help piece together attack chains that you can generalize

  • Sigma serves as a universal language that forces rule content structure and documentation, and has a rich library of converters to your SIEM of choice

  • Detection as code pipelines serve as a quality gate for human review and for testing

  • SIEM APIs have capabilities to ingest a candidate rule and make sure it’s valid in its native language

Karaduman’s approach here follows the pattern I listed above, and it’s functionally sound. It follows a lot of the fundamentals of the detection engineering lifecycle. The agents take ideation as an input, and continuously research, design, and validate candidate rules. Once the Sigma rule is created, Karaduman leverages sigconverter.io to translate the rule into SPL and has a separate SPL validation agent to make sure it can run in production.
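As a toy illustration of the conversion step (not Karaduman’s actual pipeline, which leans on sigconverter.io), translating a Sigma-style selection into SPL can be sketched like this:

```python
def sigma_selection_to_spl(index, selection):
    """Naive sketch: AND together field/value pairs from a Sigma-style
    selection into an SPL search string. Real converters (sigconverter.io,
    sigma-cli backends) also handle modifiers like |contains, logsource
    mapping, and per-SIEM field renames."""
    clauses = []
    for field, value in selection.items():
        if isinstance(value, list):  # Sigma list values are OR'd
            ors = " OR ".join(f'{field}="{v}"' for v in value)
            clauses.append(f"({ors})")
        else:
            clauses.append(f'{field}="{value}"')
    return f"index={index} " + " ".join(clauses)

# Hypothetical shadow-copy-deletion selection, for illustration
spl = sigma_selection_to_spl("windows", {
    "Image": "C:\\Windows\\System32\\vssadmin.exe",
    "CommandLine": ["delete shadows", "resize shadowstorage"],
})
```

The gap between this toy and a production converter (field mappings, value modifiers, escaping) is exactly why a separate SPL validation agent against the SIEM API is a smart addition.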

It’s a clever setup with several “smaller” agents performing tasks, which looks to be the optimal setup for this agent-to-agent workflow. I’m impressed at the simplicity of their architecture, and they were kind enough to include the fully visualized n8n workflow for readers to experiment with.

Can you guess what the most crucial step is here? The red box of course! It compiles every piece of documentation in the rule, validates it against Claude’s Sonnet 4.5 model, generates a report and messages the hypothetical detection engineer in email and on Teams.


☣️ Threat Landscape

GTIG AI Threat Tracker: Advances in Threat Actor Usage of AI Tools by Google Threat Intelligence Group

Unlike the cyberslop post from last week, where researchers at MIT made some bold claims on AI usage by ransomware operators, Google’s intelligence group brings the receipts on threat actor usage of LLM tools during operations.

I quite like the coining of “just-in-time” malware leveraged by two families they track as PROMPTFLUX and PROMPTSTEAL. These both generate malicious code on demand, and it looks like a multi-agent step that creates the code and obfuscates it during malware execution.


U.S. Nationals Indicted for BlackCat Ransomware Attacks on Healthcare Organizations by Steve Alder

Two American security professionals were indicted for allegedly working as initial access brokers for BlackCat ransomware. This is a wild story: they both worked for a threat intelligence company named DigitalMint, conducting RANSOMWARE NEGOTIATIONS on behalf of victims. Talk about insider threat, right?

In a classic case of insider threat motives, the main conspirator was in debt and went into business with BlackCat to help relieve that debt. This is a common tactic employed by spy agencies, so, logically, it would also work for criminal gangs.


Ex-L3Harris Cyber Boss Pleads Guilty to Selling Trade Secrets to Russian Firm by Kim Zetter

Is it insider threat week? It feels like insider threat week. Zetter reports on a man who was arrested and pleaded guilty to selling trade secrets to an “unnamed Russian software broker”. The accused worked for L3Harris Trenchant, a U.S.-based developer of zero-day and exploitation tools, and earned over seven figures in the process.


Interview with the Chollima V by Mauro Eldritch, Ulises, and Sofia Grimaldo

This series by the Bitso Quetzal team highlights their research (and shenanigans) with live interviewing DPRK IT Workers. The interesting part of this interview, and potentially a change in WageMole's TTPs, is that they are interviewing and recruiting collaborators to conduct interviews on behalf of WageMole. There were early reports of this happening, but Grimaldo, Ulises, and Eldritch brought receipts in the form of chat logs, Zoom screenshots, and LinkedIn profiles.


LANDFALL: New Commercial-Grade Android Spyware in Exploit Chain Targeting Samsung Devices by Unit 42

LANDFALL is a Samsung Android-based spyware family discovered by Unit 42 researchers. They found this family while hunting for exploit chains related to the DNG processing exploit that Apple disclosed earlier this year. DNG is a file format that both Android and iOS can process, and it’s within this processing logic that the vulnerability and subsequent exploit chain exist.

It’s pretty neat how the Unit 42 team came across this malicious file: they were hunting for DNGs to replicate the iOS exploit and found one that had a Zip file appended to it, but was exploiting Samsung’s recently patched vulnerability from earlier this year. The team pulled apart the malicious DNG, found two .so files and mapped out the command and control network associated with it.


🔗 Open Source

OSINTI4L/Paper-Pusher

A Bash script for sending spam to WiFi-connected printers over LAN.

😭😭😭


karlvbiron/MAD-CAT

MAD-CAT is a chaos engineering tool that implements data wiping and corruption attacks against databases, letting detection engineers simulate database failures and wiper-style attacks. It supports six database technologies: MongoDB, Elasticsearch, Cassandra, Redis, CouchDB, and Apache Hadoop.


FoxIO-LLC/ja4

JA4 TLS fingerprinting library referenced in Alves’ post above. I’ve linked JA4 before, but it’s a seriously effective tool to add to detection arsenals, especially if you can instrument it in publicly accessible servers.


EvilBytecode/NoMoreStealers

A Windows minifilter driver that blocks filesystem access to specific file paths to prevent infostealers. The hardcoded paths it protects include browser secret data, cryptocurrency wallets and secrets, and chat applications.


Idov31/EtwLeakKernel

Event Tracing for Windows (ETW) consumer that requests stack traces to leak Kernel addresses. This can help with exploit development if you need to exploit a Kernel vulnerability and require base addresses, potentially defeating ASLR.


DEW #136 - ATT&CK V18 deep dive, Cyberslop @ MIT & Aisuru repurposes to residential proxies

5 November 2025 at 14:03

Welcome to Issue #136 of Detection Engineering Weekly!


✍️ Musings from the life of Zack in the last week:

  • I’m trying something different here and performing a deeper analysis on content where I think it matters for y’all. It won’t happen often, but whether it’s a Gem or a piece of Threat Landscape news, I want to give you all my take beyond what you normally see, especially if it’s a story I’m particularly passionate about!

  • I just hit my 4-year anniversary at Datadog, so time is flying by. My 3-year anniversary for the newsletter is in a few weeks and it feels wild thinking about doing this for 36 months.

  • I stole every adult-sized candy bar from my kids at Halloween, and I didn’t think twice about it.

This Week’s Sponsor: Hack The Box

Your Tools Don’t Defend. Your People Do.

Threats evolve faster than your tech stack. Hack The Box keeps your teams ahead of attackers with hands-on, continuous upskilling that powers real Continuous Threat Exposure Management (CTEM).

Equip your people with the skills to validate, prioritize, and respond effectively and build the true resilience that keeps your organization ready for whatever comes next.

Get Your Team Started


💎 Detection Engineering Gem 💎

ATT&CK v18: The Detection Overhaul You’ve Been Waiting For by Amy L. Robertson

New ATT&CK version drops always deserve a feature in this newsletter, and I’m very pleased to see the changes in v18!

There are several techniques and procedures added to the ATT&CK arsenal, but I’d like to focus my analysis on the usefulness behind Detection Strategies for detection ideation and tuning.

Detection Strategies

The new version shipped a large change in how ATT&CK approaches detections via Detection Strategies. I wrote about this in Issue 121, but the common gap with ATT&CK is linking a technique or procedure to detection guidance. Through the use of STIX Domain Objects, defenders can now leverage these detection opportunities via machine-readable data, rather than relying on freeform text. Here’s an example leveraging Scheduled Task/Job Abuse:

I used Linux as an example here. You have three data components associated with finding scheduled job attacks. Each of these components has a log source name and channel. So, for line 6 (DC0061), you can use auditd syscall monitoring and look for writes and renames of cron files. The mutable elements part helps with detection tuning, and this can be everything from frequency analysis to environmental context, such as unusual users scheduling jobs.
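A minimal sketch of that analytic (my own illustration, not MITRE’s): given parsed auditd records, flag writes or renames under cron paths, with the user allowlist standing in for one of the mutable elements.

```python
CRON_PATHS = ("/etc/crontab", "/etc/cron.d/", "/var/spool/cron/")
WRITE_SYSCALLS = {"write", "openat", "rename", "renameat2"}

def flag_cron_writes(records, allowed_users=frozenset({"root"})):
    """records: iterable of dicts parsed from auditd syscall events, e.g.
    {"syscall": "openat", "path": "/etc/cron.d/backdoor", "user": "www-data"}.
    The field names here are my assumption; map them from your auditd
    parser. Tuning allowed_users is the 'mutable element' from the
    detection strategy (environmental context about who schedules jobs)."""
    alerts = []
    for rec in records:
        if rec.get("syscall") not in WRITE_SYSCALLS:
            continue
        path = rec.get("path", "")
        if any(path.startswith(p) for p in CRON_PATHS):
            if rec.get("user") not in allowed_users:
                alerts.append(rec)
    return alerts
```

Frequency analysis, the other mutable element mentioned, would layer on top: alert only when a user writes cron files they have never touched before.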

Enterprise Updates & ESXi Detection Strategy Example

The team added several new techniques, and there seems to be a big push on cloud-native technologies. For example, adding the Container CLI or API (in the case of Kubernetes) is a great step toward capturing how threat actors are moving away from on-prem technologies while using similar techniques to move through the kill chain.

Local Storage Discovery, for example, highlights typical discovery tradecraft for finding interesting volumes on a victim machine. But there’s nuance here with whether you are on a cloud server, Windows host, or a Hypervisor. Looking at the Detection Strategy DET0188, a detection engineer can switch between Analytics platforms and perform their own testing based on the data components and channels. Now let’s work through tuning, and I’ll pick on this Sigma rule, ESXi Storage Information Discovery Via ESXCLI.

Nas’ and Maurugeon’s rule successfully implements the Data Component → Name → Channel analytic, but the rule may be too broad (high recall) and requires tuning. If you study the Mutable Elements table, you can scope this rule down by restricting alerts based on ssh_source_ip being from outside your perimeter, or by tuning the esxcli_command_scope. Let’s tune via the command scope.

Reading the developer portal for esxcli, and with a bit of help from Claude, the command scope namespace looks like the following:

Lines 40-42 could be potential tuning updates to the Sigma rule to make it more precise. This would obviously need some testing, but moving from Analytic → Sigma Rule → ESXi command line documentation (thanks, Claude) to tuning was much easier.

For a deep dive into this type of detection research, check out Nathan Burns’ blog on the topic, which I posted in Issue 100 as a Gem.

Why is this important?

In my example, I walked through a tuning opportunity for ESXi. I’m not an ESXi expert, but I have good knowledge of Linux threat detection and MITRE ATT&CK. The Detection Strategy quickly oriented me to understand core detection opportunities, but also provided tuning ideas for broad to precise esxcli commands to alert on. Additionally, it took it a step further with SSH source IP environment hardening.

The ATT&CK knowledge base can now serve more than just a reference table for techniques. You can dive into each technique, get relevant examples for threat actors, and it points you to strategies with specific data sources and channels to alert on. It cuts down the time I would spend on Googling or setting up environments to smash my head on the keyboard until I get the right logging configuration to generate the alert telemetry.


☣️ Threat Landscape

CyberSlop — meet the new threat actor, MIT and Safe Security by Kevin Beaumont

This new series by Kevin Beaumont revolves around a new term he coined, “CyberSlop.” The definition I’ve gleaned from his writing is taking traditional FUD marketing techniques in cybersecurity and leveraging trusted institutions (like MIT in this story) to make AI-threat claims even more credible, especially through research papers and blog posts that lack evidence.

The story in this first edition revolves around a bold claim by MIT researchers in a paper that 80% of ransomware gangs use AI in their operations. After digging into the paper and publicly calling it out, it disappeared from the MIT website. Two of the authors are from Safe Security, a cybersecurity startup. As it turns out, the principal MIT researcher is on their board, with no disclosure of this conflict of interest in the paper.


Aisuru Botnet Shifts from DDoS to Residential Proxies by Brian Krebs

DDoS-for-hire botnets don’t pay enough for the criminals who run them. At the end of the day, a DDoS is an inconvenience that sites suffer, and the Googles and Cloudflares of the world have gotten so good at soaking traffic that these botnets feel even more irrelevant than before.

Residential proxies, on the other hand, are where money CAN be made. And this piece on the Aisuru botnet, a DDoS-for-hire botnet turned into residential proxy provider, is a good breakdown of these intricacies. In this post, Krebs exposes a web of proxy services, parent companies and the grayhat style recruitment they have of unsuspecting devices to build their new-age botnet.


Ukrainian National Extradited from Ireland in Connection with Conti Ransomware by U.S. Department of Justice

The U.S. DoJ extradited a suspected Conti member residing in Ireland. Lytvynenko was first arrested in 2023 at the request of the FBI, and has been facing extradition proceedings since then. There are some wild numbers cited in this report, which highlight the prolific nature of Conti. Lytvynenko is accused of extorting $150 million in ransomware payments from Conti victims alone.


SesameOp: Novel backdoor uses OpenAI Assistants API for command and control by Microsoft Incident Response

This is the first threat report I’ve read where a threat group leverages OpenAI as a C2 channel. SesameOp is the name Microsoft Incident Response gave a new malware family that uses OpenAI’s Assistants API, which is now deprecated and slated for removal next year. The malicious DLL queries the Assistants API vector store to find infected hostnames and then leverages the Assistant’s description field to execute a command.

The vector store part here is interesting because I imagine it makes detecting abuse much more challenging for security teams at OpenAI. You can typically scan platforms for victim or malicious domains, but do you now need to scan every vector store for the same thing?


A new breed of analyzers by Daniel Stenberg

Stenberg, the creator and head maintainer of cURL, triages and patches numerous security vulnerability submissions. In the before AI times, these submissions were (mostly) done by humans with some level of automated slop from fuzzers. Since then, a large number of LLM-generated slop submissions have burdened the cURL team.

It was cool seeing this update, almost a Part 2 to the post I linked previously. AI-backed vulnerability discovery and submission platforms are getting much better, especially those with venture capital behind them, rather than a “researcher” running some LLM locally to find security weaknesses.


🔗 Open Source

kas-sec/version.dll-sideloading

Neat proof of concept abusing OneDrive.exe and DLL sideloading to gain execution in the OneDrive process. Once it gains execution, the malware registers exception hooks via Vectored Exception Handling (VEH) to bypass EDR detection. The registered exception handler hopefully avoids being hooked by the EDR process so you can evade detection.


center-for-threat-informed-defense/attack-workbench-frontend

ATT&CK’s frontend application that serves as a self-hosted knowledge base for detection engineers and the ATT&CK library. With the latest v18 release, you’ll see additional resources leveraging Detection Strategies.


loosehose/SilentButDeadly

EDR killer technique that leverages the Windows Filtering Platform to prevent EDR agents from phoning home to cloud infrastructure. Super useful for preventing alerts from being sent to the cloud, but could still be noisy as an EDR evasion technique.


zopefoundation/RestrictedPython

Sandbox-like Python runtime execution environment for running untrusted code. It’s not a sandbox like a virtual machine, but it’s a subset of the Python language that restricts risky primitives in Python that can be used maliciously.


malwarekid/OnlyShell

Go-based reverse shell handler that integrates several types of reverse shells into one interface. So if you have a Bash reverse shell and a PowerShell one reaching out, it will automatically detect the environment and shell type so you can select via its TUI-like interface.


DEW #135 - Chaos Detection Engineering, Connecting Policy to IR playbooks & Spooky AWS Policies

29 October 2025 at 13:03

Welcome to Issue #135 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week

  • I’m helping host the second edition of Datadog Detect tomorrow! We have an excellent lineup with folks I’ve featured several times on this newsletter. It’s fully free, fully online, and also available on-demand. We have a small capture the flag afterward to win some socks.

    • 👉 Register Here 👈 and don’t forget to meme out in the webinar chat like last time.

    • We had close to 1000 chatters so it felt like a Twitch stream

  • I’m all booked for London and got some excellent pub and restaurant recommendations. Please keep them coming :D


This Week’s Sponsor: detections.ai

Community Inspired. AI Enhanced. Better Detections.

detections.ai uses AI to transform threat intel into detection rules across any security platform. Join 9,000 detection engineers leveraging AI-powered detection engineering to stay ahead of attackers.

Our AI analyzes the latest CTI to create rules in SIGMA, SPL, YARA-L, KQL, and YARA and translates them into more languages. Community rules for PowerShell execution, lateral movement, service installations, and hundreds of threat scenarios.

Join @ detections.ai

Use invite code “DEW” to get started


💎 Detection Engineering Gem 💎

How to use chaos engineering in incident response by Kevin Low

Hey look, security steals SRE concepts again, and it’s a beautiful thing! Jokes aside, this is a concept I’ve believed in heavily since I started working professionally with SRE organizations 10+ years ago. Chaos engineering is a practice that intentionally injects faults into a production system to test resiliency and build confidence in the face of failures. Basically, it challenges you to break something to see how fast you can react to and recover from an outage, almost like intentionally popping a tire on your car to see how well you react and change it.

This seems applicable to security, no? That’s where Low’s post comes in to test the idea. First, Low makes a gentle introduction to the concept and then presents a test architecture and a threat model in an AWS environment to experiment with.

Figure 2: Architecture after GuardDuty detects unexpected activity and the security team isolates the EC2 instance

In this scenario, a microservice experiences some unexpected security activity and GuardDuty generates an alert. If you shut down an EC2 instance, what exactly happens? Enter Chaos Engineering!

There are five steps in a Chaos Engineering experiment: defining the steady state, generating a hypothesis, running the experiment, verifying the effects, and improving the system. This has a nice carryover for testing detections and their infrastructure in production states.

  • Steady State: What is our baseline for MTTR and MTTD? What is the general uptime of our log sources? What configurations are in place to prevent attack paths?

  • Hypothesis: When a workstation queries a known malicious domain, our SIEM will detect it within 15 minutes, notify the security team within 2 minutes, and the machine will be contained 1 minute after that

  • Running the experiment: Load a benign domain inside your threat intelligence look up tables, remotely connect to a machine and perform a DNS lookup for the benign domain.

  • Verifying the effects: Did we generate an alert in the SIEM? Was there a Slack notification to contain the host? Did it fall within our hypothesis’ parameters?

  • Improving the system: The Slack alert did not defang the domain, the containment tooling only blocked the domain and not the resolution IP
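The five steps map naturally onto a tiny test harness. Here’s a sketch with a stubbed SIEM; every name and threshold below is mine, for illustration, not from Low’s post:

```python
import time

def run_chaos_experiment(inject, poll_alerts, mttd_budget_s=900.0, poll_every_s=1.0):
    """Inject a benign 'malicious' event, then measure how long the
    SIEM (stubbed here as a poll_alerts callable) takes to alert.
    Returns (detected, elapsed_seconds); elapsed is the measured MTTD,
    to compare against the hypothesis (e.g. 15 minutes)."""
    start = time.monotonic()
    inject()  # e.g. DNS lookup of the benign domain seeded into TI tables
    while time.monotonic() - start < mttd_budget_s:
        if poll_alerts():
            return True, time.monotonic() - start
        time.sleep(poll_every_s)
    return False, mttd_budget_s

# Stubbed run: the "SIEM" alerts on the first poll.
detected, elapsed = run_chaos_experiment(
    inject=lambda: None,
    poll_alerts=lambda: True,
    mttd_budget_s=5.0,
    poll_every_s=0.1,
)
```

Swap the lambdas for a real DNS lookup and a real SIEM API query, and the same loop verifies the hypothesis’ 15-minute detection window in production.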

I love this approach, and I’m unsure whether any companies are considering this type of fault- or “adversary injection”-style testing. Breach Attack Simulation products focus on coverage of rules, but I haven’t seen anyone think about this from a Detection & Response validation angle.


🔬 State of the Art

A Retrospective Survey of 2024/2025 Open Source Supply Chain Compromises by Filippo Valsorda

In this post, Valsorda performs a retrospective survey analysis of all open-source supply chain attacks from 2024 to 2025. At Datadog, we collect 100s to 1000s of these types of malicious packages to help defend our environment, but a supply chain compromise is more than just a malicious package. These last 3 months alone have had compromises that made mainstream news, such as Shai-Hulud and s1ngularity.

Valsorda grouped the root causes of 17 major attacks to help readers understand initial access and subsequent attack paths. Funny enough, phishing was the number one root cause of these package takeovers, and the number two was an attack path I haven’t been able to put into words: control handoff. The basic premise behind control handoffs is that they’re part social engineering and, IMHO, part insider threat. For example, the infamous xz-utils attack originated when a new maintainer gradually gained trust over time before adding a backdoor to the library. The polyfill[.]io attack involved purchasing an expired domain, after which the new owner served malicious JavaScript to victims.

It’s a fascinating read as a survey blog, but it highlights how fragile the open-source software ecosystem is. It’s unfair how large companies and organizations demand feature and security work from some of these projects without pay, and, understandably, burnout from these demands has become a real security issue once attackers exploit them.


Re-Writing the Playbook — A detection-driven approach to Incident Response by Regan Carey

Merging governance, risk and compliance documents and policies across an organization is difficult. I think the most salient example of turning a policy into practice is mandatory 2FA. You write a policy that mandates 2FA, perhaps based on a SOC 2 or ISO 27001 audit, and your IT team buys physical YubiKeys and configures Google Workspace to ensure that all authentication requires a hardware key.

This gets harder and more nebulous in the threat detection space. 2FA is clean and measurable; you can pull reports of the number of employees enrolled in 2FA and drive it to completion. But how do you drive a Ransomware Response Playbook to completion? Is it that you have a playbook? Is it that you have EDR tooling, plus a playbook? Or is it that you have a playbook, you have EDR tooling, and you have Bob from IT who presses a button when an EDR alert fires?

But what about individual rules that respond to ransomware? Are they firing accurately? Is the SPECIFIC response playbook inside the rule up to date? When do you know it's out of compliance with the overall playbook? I think the answer is: you don’t and you won’t. This is where Carey begins their exercise and proposes their Incident Response Diamond concept.

Translation and mutation of data can result in loss of specificity, which is no different from a data engineering pipeline problem. Data engineering solves this through meticulous field mapping and clear documentation, and I think that is what Carey’s Diamond concept does here. Basically, they define a handoff from non-technical playbooks into rules, but keep a lineage of how specific playbooks are invoked by rules so you know which policy each falls under.

I think this is a great approach, but it means your security response and GRC teams need lots of alignment to pull it off. Documentation is one of the hardest parts of security, and keeping rules up to date is already hard enough.


Fantastic AWS Policies and Where to Find Them by David Kerber

The hardest thing in Computer Science is cache invalidation. The second hardest thing in Computer Science is naming things. For security, I think the hardest thing is understanding cloud identity models. The second hardest thing is also naming things.

One of the best ways in AWS to reduce the blast radius of attacks, or prevent them altogether, is to leverage the myriad of policy types AWS makes available to customers. But a word of caution from Kerber: the number of tools at your disposal here can also be your downfall. As Chester Le Bron puts it:

You now need to become an SME in the operating system called AWS and its core services, some of which (like IAM) could be considered their own OS due to their complexity

So, in this post, Kerber outlines every type of AWS policy used to manage access. There are several types: some let you Allow or Deny access, while others can only Deny, and they attach to different principals and resources, including users, roles, service accounts, and even GitHub Actions.
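
To make the Allow/Deny split concrete, here’s a minimal sketch of the Deny-only flavor, a Service Control Policy. The policy content is my own illustrative assumption, not an example lifted from Kerber’s post:

```python
import json

# Hypothetical Service Control Policy (SCP) -- the Deny-only policy type
# that attaches to accounts or OUs in an AWS Organization. SCPs never
# grant access; this one blocks IAM access key creation org-wide.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessKeyCreation",
            "Effect": "Deny",
            "Action": ["iam:CreateAccessKey"],
            "Resource": "*",
        }
    ],
}

print(json.dumps(scp, indent=2))
```

An identity-based policy would look similar but could carry an `"Effect": "Allow"` statement; that asymmetry between policy types is the heart of Kerber’s taxonomy.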

Luckily, each section is split up so folks can use this blog as a reference post when they need a refresher. Kerber also open-sourced a tool called iam-collect to help retrieve all of these policies locally for analysis. I’ll list the tool in the open-source section at the bottom of this week’s issue!


Introducing CheckMate for Auth0: A New Auth0 Security Tool by Shiven Ramji

CheckMate is a free Auth0 tenant configuration tool that operates as a CSPM for Auth0 deployments. It has checks for all kinds of misconfigurations in an Auth0 environment, and you can run them on an interval to detect drift and fix it before it becomes a problem. One of the cooler, less CSPM-y parts from a pure security product perspective is the extensibility runtime checks: it’ll run several checks against custom Auth0 runners to find everything from hardcoded passwords to vulnerable npm packages.


☣️ Threat Landscape

UN Convention against Cybercrime opens for signature in Hanoi, Viet Nam by United Nations Office on Drugs and Crime

The United Nations hosted its “Convention against Cybercrime” signing ceremony in Hanoi, Vietnam last week. Besides sounding like a sick conference (I hope someone wore a hacker hoodie), 72 countries signed an international treaty that provides guidance and guardrails for nations battling international cybercrime. The post has some interesting highlights from the treaty, including standards for electronic evidence collection, mechanisms for sharing data easily, and recognition that disseminating non-consensual sexual images is an offense.


Lessons from the BlackBasta Ransomware Attack on Capita by Will Thomas

Cyber threat intelligence G.O.A.T. Will Thomas dissected the 136-page ICO report on Capita Group’s breach by BlackBasta in 2023 for some juicy intelligence and lessons learned. The cool part of this is that Will found messages from the BlackBasta chat leak that line up with the timeline published in the ICO report.

It’s nice to get commentary from a CTI expert on publicly facing penalty notices and disclosures. Lessons learned are great at a high level, but digging into exact TTPs from BlackBasta and comparing them to the material failures within Capita’s security program is way more useful to the rest of the security community.


CVE-2025-59287 WSUS Unauthenticated RCE by Batuhan Er

This week, Microsoft released an out-of-band security update for its Windows Server Update Services (WSUS) product. WSUS lets administrators manage the installation of Windows updates across their fleet. The deserialization vulnerability results in remote code execution, so Microsoft scored CVE-2025-59287 a critical 9.8.

In this vulnerability walkthrough, Er follows the vulnerable code path and ends with a PoC to exploit the vulnerability. The discovery here is that WSUS unsafely deserializes encrypted XML objects in the GetCookie() endpoint, so sending a specially crafted serialized object yields RCE.


Exploitation of Windows Server Update Services Remote Code Execution Vulnerability (CVE-2025-59287) by Chad Hudson, James Maclachlan, Jai Minton, John Hammond and Lindsey O’Donnell-Welch

As a follow-up post to Er’s above, the Huntress team found in-the-wild exploitation of CVE-2025-59287. A handful of their customers had Internet-exposed WSUS servers. When the vulnerability details and subsequent PoCs dropped, attackers leveraged the exploit against exposed servers. Most of the activity looked like initial reconnaissance, but this post goes to show how fast you have to react to emerging vulnerabilities, especially when you have misconfigurations that could have prevented exploitation.

The team also dropped a Sigma rule and IoCs for readers to hunt on.
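
As a rough illustration of what a detection for this kind of post-exploitation activity encodes, here’s a toy parent/child process check. The process names and event fields are my assumptions for the sketch, not Huntress’s published Sigma logic:

```python
# Illustrative sketch only -- the exact process pairs and field names
# are assumptions, not the published rule.
SUSPICIOUS_PARENTS = {"wsusservice.exe", "w3wp.exe"}
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe"}

def is_suspicious(event: dict) -> bool:
    """Flag shell spawns from WSUS-related server processes."""
    parent = event.get("parent_image", "").lower().rsplit("\\", 1)[-1]
    child = event.get("image", "").lower().rsplit("\\", 1)[-1]
    return parent in SUSPICIOUS_PARENTS and child in SUSPICIOUS_CHILDREN

print(is_suspicious({
    "parent_image": r"C:\Program Files\Update Services\Service\wsusservice.exe",
    "image": r"C:\Windows\System32\cmd.exe",
}))  # → True
```

The same logic expressed as a Sigma rule would put the parent/child pairs in the `detection` block and leave the field mapping to your backend.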


Hugging Face and VirusTotal: Building Trust in AI Models by Bernardo Quintero

This is a ~small product update for VirusTotal’s integration into HuggingFace’s registry of AI models. I usually don’t post product updates, but both VirusTotal and HuggingFace are community-driven products. It’s nice to see the VirusTotal team commit to helping developers identify malicious models hosted on HuggingFace.


🔗 Open Source

auth0/auth0-checkmate

GitHub link for the CheckMate project that the Auth0 team open-sourced. You can see all of their checks in code, and it looks like it operates similarly to Prowler.


cloud-copilot/iam-collect

Kerber’s iam-collect repo from the story I linked in State of the Art above. Give it access to your AWS environment and it’ll rip through the IAM policies and download them to disk. It links to a separate GitHub project called iam-lens to help simulate and evaluate effective permissions.


EmergingThreats/pdf_object_hashing

PDF object hashing is a technique similar to imphash where you compare the structure of PDF documents without focusing on the content inside. imphash is a helpful technique for identifying similar binary features and symbols so you can cluster malware samples and find new ones. This follows the same philosophy, letting you cluster malicious PDF documents using similar techniques.
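
As a toy illustration of the idea (my own simplification, not the EmergingThreats implementation), you can hash the sequence of object /Type names in a PDF while ignoring the objects’ actual contents:

```python
import hashlib
import re

def pdf_object_hash(pdf_bytes: bytes) -> str:
    """Toy structural hash: digest the ordered sequence of /Type names.

    Like imphash, this captures structure (which object types appear, in
    what order) rather than content, so documents that share a template
    but carry different payloads hash identically.
    """
    types = re.findall(rb"/Type\s*/(\w+)", pdf_bytes)
    return hashlib.md5(b",".join(types)).hexdigest()

# Two documents with identical structure but different page contents.
doc_a = b"1 0 obj <</Type /Catalog>> endobj 2 0 obj <</Type /Page /Contents (hello)>> endobj"
doc_b = b"1 0 obj <</Type /Catalog>> endobj 2 0 obj <</Type /Page /Contents (evil!)>> endobj"
print(pdf_object_hash(doc_a) == pdf_object_hash(doc_b))  # → True
```

A real implementation would parse the PDF object tree properly (compressed object streams, indirect references) rather than regex over raw bytes.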


chainguard-dev/malcontent

I’ve been following Chainguard’s malcontent project for a while and it looks like they’ve been throwing a lot of development at it. It’s a supply-chain compromise detection system that uses a butt-ton (yes, a butt ton) of analysis techniques, including close to 15,000 YARA detections, to help detect these compromises before they make it into your build and production systems.


ForensicArtifacts/artifacts

Machine-readable knowledge base of forensic artifact information. It has a good number of YAML files that store metadata about specific sources and the files and directory paths you can use during forensic analysis.

Every week, I read, watch and listen to all the Detection Engineering content so you can consume it all in 10 minutes. Subscribe and get a weekly digest of the latest and greatest in threat detection engineering!

DEW #134 - Prioritizing Critical Assets, AI SOC means MORE alerts and Microsoft CoPilot Phishing

22 October 2025 at 14:03

Welcome to Issue #134 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week


  • I popped and tore muscle/cartilage in my ribs on Friday. Urgent care sent me to the ER, and the ER laughed at me and said I’m too young to hurt my ribs and come to the hospital, so they sent me home D:

  • I’m booking a (small) solo trip to London in December. Who’s got restaurant and, more importantly, pub recommendations in the Soho area? Shoot me a message and I’ll buy you a (virtual or not) pint!

  • I get some AMAZING content sent to me in all kinds of mediums, but it’s hard for me to keep track. So, I made a submissions form @ https://submit.detectionengineering.net that sends your blog details straight to my Notion. If you are writing something, I want to know!

This Week’s Sponsor: detections.ai

Community Inspired. AI Enhanced. Better Detections.

detections.ai uses AI to transform threat intel into detection rules across any security platform. Join 9,000 detection engineers leveraging AI-powered detection engineering to stay ahead of attackers.

Our AI analyzes the latest CTI to create rules in SIGMA, SPL, YARA-L, KQL, and YARA and translates them into more languages. Community rules for PowerShell execution, lateral movement, service installations, and hundreds of threat scenarios.

Join @ detections.ai

Use invite code “DEW” to get started


💎 Detection Engineering Gem 💎

Critical Asset Analysis for Detection Engineering by Gary Katz

If everything is Priority, nothing is Priority

I think about this mantra when I am looking at team planning for our security org at $DAYJOB. Security has a thankless job in many ways: when things go wrong, we are both in the spotlight and under a microscope. When things go well, we may seem invisible to others. This means scrutiny comes at the worst times, such as during an emergency, and the amount of planning and prioritization you do beforehand can really showcase how mature you are as a security program.

Lots of detection blogs I read talk about sending telemetry into a SIEM or a logstore and how to run detection logic over that telemetry. These blogs carry a large assumption: every piece of telemetry is created and maintained equally. In the real world of business, this is far from the truth. A workstation going offline versus a domain controller going offline is an example of what Gary calls a “chokepoint.”

These chokepoints are assets that become the biggest target for adversaries, and labeling them as Critical Assets provides clarity to your security team and your leadership team that you are putting focus in the right spots. The Critical Asset approach here requires conversations up and down your reporting chain, but it should render insights into what a detection team should prioritize first.

I love this approach because it shifts the conversation away from 100% MITRE coverage across everything to focused and directed coverage on your organization’s most critical services and assets. According to Katz, this methodology should output a prioritized list of assets, relevant attack paths, and coverage metrics that you can provide to others in your organization to showcase the value in peacetime (not during an incident).

The only part of this approach that I struggle with, not specifically with Katz’s but in general, is that it’s hard to highlight coverage on assets as the list grows.


🔬 State of the Art

How AI Transforms Detection Engineering by Filip Stojkovski

Like most things in security, detection engineering is a capacity problem. Every security operations function has three knobs to dial to scale their org, and each comes at some cost: people, process, and technology. SOCs traditionally address the need for scale through people, but that approach scales linearly at best, because you can only triage more alerts by hiring more people. This is where process and technology help scale the function, especially if you have a solid engineering foundation and a healthy department culture that constantly updates processes.

One of a detection engineer’s most potent “knobs” is tuning how much or how little threat activity and benign traffic you capture. According to Stojkovski, this knob has always leaned towards precision (what we capture is relevant), as we don’t want to overwhelm the capacity of triage analysts. But does this change with the advent of AI SOC technology?

Stojkovski argues it does, and I have to agree. LLMs let us turn the “technology knob” way, way up, which means we gain a scale advantage that isn’t pinned to linear growth in headcount. I also really like the nuance that this tech should focus on benign true positives and false positives, so analysts spend their time on real incidents rather than burning 10-15 minutes per alert on tuning.


CoPhish: Using Microsoft Copilot Studio as a wrapper for OAuth phishing by Katie Knowles

~ Note, Katie works at Datadog and is my colleague ~

AI-based features introduce risks we’ve never seen before, and it’s easy to see why they get the hype. Prompt injections lead to some funny outcomes, but the more overlooked part of AI implementation is tried-and-true vulnerabilities. Developer teams are pushed to ship features quickly so they aren’t last to market, and misconfigurations and non-standard development workflows creep into production, leaving users and organizations alike vulnerable.

This is the case with Katie’s latest research into Microsoft’s Copilot Studio, Microsoft’s workbench product for developers who want to create AI chatbots. According to Katie, it has some confusing UI/UX workflows for authenticating to a chatbot, as well as poor permission structures, which allow attackers to mount OAuth consent phishing attacks.

An attacker can use a malicious Copilot Studio agent to trick a target into an OAuth phishing attack. The attacker or agent can then take actions on the user's behalf.

Desired State Configurations by smash_title

This is the first time I’ve heard of Microsoft’s infrastructure-as-code and configuration management policy language, Desired State Configurations (DSC). So, this was a helpful post for me to understand Microsoft’s approach to DevOps using native tooling from the hyperscaler. smash_title came across this technology set while creating a detection engineering-style lab for Azure Virtual Machine Windows and Linux detection testing.

It does look similar to the likes of Terraform and Ansible depending on which of the three versions you are using. There are some neat features that I don’t think I’ve seen in other similar technologies, such as drift detection and correction, and workstation resource management. It looks like Microsoft is sunsetting the earliest version that relies on PowerShell, and wants to move to a pure JSON/YAML-style declarative format, but they seem to be pretty far away from feature completeness on the newer versions.


Introducing HoneyBee: How We Automate Honeypot Deployment for Threat Research by Yaara Shriki

HoneyBee is an open-source toolset that automates the creation of honeypot stacks leveraging LLMs. Unlike other honeypots that put LLMs inside the web-app to mimic an environment, this one focuses on the configuration management and infrastructure component, which I think is a much more fruitful approach for detection engineers.

You provide access to your favorite foundational model, select a technology stack, and select one or many misconfigurations in the Wiz catalog, and it generates docker-compose files for use. This is helpful when you are building detections for specific stacks and want to see how telemetry is generated after a misconfiguration is exploited. Alternatively, you can deploy this on a honeypot listening on the Internet to collect indicators of compromise.


From Logs to Leads: A Practical Cyber Investigation of the Brutus Sherlock by Adam Goss

This is an in-depth walkthrough of the forensics challenge “Brutus” on hackthebox. I like Goss’s approach of splitting the investigation into four distinct skillsets: interpretation, collection, capability comprehension, and manipulation. Each one of these skills involves understanding a target system’s technology stack, gathering necessary data from various sources, and then using the tooling you have at your disposal to interpret the timeline of events.


☣️ Threat Landscape

[RESOLVED] Increased Error Rates and Latencies by Amazon Web Services

dns haiku

What a crazy turn of events: a cascading DNS failure starting in AWS DynamoDB, which then affected an internal service supporting launching EC2 instances, which then messed up health checks on load balancers and spread through 142 separate services.


To Be (A Robot) or Not to Be: New Malware Attributed to Russia State-Sponsored COLDRIVER by Wesley Shields

This GTIG blog is a great example of how threat actors can rapidly adjust their malware development as they deploy it. Shields profiles COLDRIVER (aka Star Blizzard)’s new malware delivery chain. It uses phishing as the initial lure, which leads to a ClickFix infection. During the infection, COLDRIVER leveraged a clunky Python-based backdoor, then began simplifying the malware away from Python and towards PowerShell. It looks like COLDRIVER abandoned Python because it required a Python runtime to execute, whereas PowerShell is natively available across their victim set.


Email Bombs Exploit Lax Authentication in Zendesk by Brian Krebs

Threat actors bombarded the customers of large Zendesk-using companies last week by abusing flaws in how Zendesk is configured. The misconfiguration allows anyone with access to a company’s Zendesk portal to trigger ticket-creation notifications that come from the company domain. Most of these were spam and troll-style messages, some even accusing Krebs of breaking the law.

But it goes to show how SaaS apps have multiple layers of configuration and can lend themselves to abuse scenarios like this if someone looks hard enough.


Revelations on Group 78, the secret US task force that fights cybercriminals by Martin Untersinger and Florian Reynaud

I was skeptical reading this headline because I’ve been burned by mysterious marketing-style blog posts, but then I realized it was an exposé from Le Monde. Untersinger and Reynaud provided readers some extraordinary background into the alleged FBI ransomware disruption task force, Group 78. The goal of the group is to perform ransomware disruption operations, up to and including arrests of suspected ransomware operators. They leverage a variety of legal and more modern tactics, such as exposing criminals' identities.

The hope is to pull all the levers they can find to degrade the trust between ransomware groups, and to be honest, I like this approach. For example, Untersinger and Reynaud assert that the ExploitWhispers leak of over 200,000 BlackBasta chat messages may have come from Group 78.


🔗 Open Source

smashtitle/DesiredStateConfigurations

smashtitle’s GitHub repository for the DesiredStateConfigurations research I posted above in the State of the Art section. The cool part is that it’s a single PowerShell script that sets up a lab environment tailored for detection engineering on Windows. It removes a lot of the B.S. out-of-the-box services and applications that cause noise for people running the lab.


yaaras/honeybee

Repository from Shriki’s research on building honeypots using LLMs. They have a neat misconfiguration index that you can use as a dropdown in your prompt on specific technologies, so that you not only build the honeypot but also intentionally misconfigure it for detection rule coverage and lure the bad guys to exploit it.


dobin/DetonatorAgent

Detonation platform for malware development and telemetry collection. The idea is to develop malware and test it against a Windows virtual machine running the DetonatorAgent. It can collect telemetry from the environment as well as from EDR.


google/osdfir-infrastructure

Helm Charts for various open source DFIR infrastructure built at Google. You can run things like minikube locally to take advantage of this, or even deploy it on managed Kubernetes on AWS or GCP.


DEW #133 - Redefining Security Visibility, TTP-First Hunting & F5 breach

16 October 2025 at 14:03

Welcome to Issue #133 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week:

  • I did a family road trip for the long weekend to my hometown. I’m happy to report to other parents that I’ve had my first experience of a kid throwing up in the backseat. Do I earn a badge of honor here?

  • Datadog Detect is BACK for round 2, so please sign up and see some excellent Detection Engineering talks! It’s free, fully remote, and there will be activities (yay!) and labs for conference goers.



💎 Detection Engineering Gem 💎

What Does “Visibility” Actually Mean When it comes to Cybersecurity? by David Burkett

The most frequent question I get from my boss at Datadog is “Are we covered?” It’s a simple question, but it’s extremely hard to answer. What does covered mean? Are we covered now, before, or in the future? Do you mean MITRE rule mappings, operational maturity, incident readiness, or threat intelligence awareness? It turns out that agreeing on a singular definition of anything in security is difficult!

It was nice to read Burkett’s post here discussing the varying definitions of visibility. Like most industry standards, several companies and organizations have attempted to define visibility, but no single standard or definition has emerged as the true winner. David adapted Splunk’s blog on observability into the security operations space, and I think it works beautifully:

  • Visibility is the holistic state wherein a system generates telemetry, is subject to robust monitoring for known conditions, and possesses observability, enabling deep, exploratory analysis to diagnose novel problems. Full visibility is achieved only when these three elements are cohesively integrated, allowing operators to move fluidly from detecting a known issue (monitoring) to exploring its unknown root cause (observability), all supported by a common foundation of high-quality data (telemetry).

He then fits this mental model into a three-tiered definition based on who is asking about visibility. The three tiers look like they are inspired by the tiered types of threat intelligence: strategic, operational and tactical. This is a great approach because visibility means something different depending on the customer you are talking to.

Senior leaders typically care about the full visibility of the business, not necessarily the individual elements along the ATT&CK chain. When you get to operational, you focus on the attack surface, such as endpoint, network, and SaaS. Each one of these attack surfaces can have many telemetry sources, think EDR and Secure Web Gateway for domain visibility. Lastly, he rounds out tactical visibility by examining specific telemetry sources, like EDR, and moving through MITRE ATT&CK to assess visibility in each stage.

All models are wrong; some are useful. This may not be “perfect” in terms of defining visibility, but in my opinion, it’s a good mental model. It pulls inspiration from SRE concepts like observability and fits that into the context of a security program’s healthiness based on the customer who is asking.


🔬 State of the Art

Hunting Beyond Indicators by Sam Hanson

Threat Hunting is the art of managing false positives. The basic idea is that you flip the premise of triage: in hunting, you cast a wide net in your queries to find needles in a haystack, while in detection engineering you want as little hay as possible. Maybe I can keep this imagery going and talk about separating wheat from chaff?

Alright, alright, enough farming analogies. I included this post because it shows the tradeoffs of hunting when starting with threat intelligence indicators versus adversary TTPs. When you plan and execute a threat hunt, the expectation is to find many results and have time to sift through them, using down-selection techniques to determine if there is an intrusion. The order of down-selection matters, though. According to Hanson, you want to start with tactics and techniques first (which I agree with), and then filter by other components like threat intelligence indicators.

If you start with threat intelligence indicators, you introduce a selection bias because they are brittle selectors and, by nature, won’t catch unknown IOCs. Focus on TTPs first, down-select to find unknown IOCs, and feel free to use IOCs after for additional enrichment.
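
The ordering Hanson argues for can be sketched in a few lines. The events, technique IDs, and IOC list here are made up for illustration:

```python
# Toy illustration of down-selection order: select on behavior (TTP)
# first, then use threat-intel IOCs only to enrich the survivors.
events = [
    {"host": "web-1", "technique": "T1059.001", "dest_ip": "203.0.113.7"},
    {"host": "db-2", "technique": "T1059.001", "dest_ip": "198.51.100.9"},
    {"host": "lap-3", "technique": "T1204.002", "dest_ip": "192.0.2.44"},
]
known_iocs = {"203.0.113.7"}

# Step 1: behavior first -- every PowerShell execution (T1059.001),
# whether or not it touches a known-bad indicator.
candidates = [e for e in events if e["technique"] == "T1059.001"]

# Step 2: IOCs as enrichment, not as the initial filter. db-2 survives
# for analyst review even though its IP is unknown to threat intel.
for e in candidates:
    e["ioc_match"] = e["dest_ip"] in known_iocs

print([(e["host"], e["ioc_match"]) for e in candidates])
# → [('web-1', True), ('db-2', False)]
```

Starting with `known_iocs` instead would have discarded db-2 immediately, which is exactly the selection bias Hanson warns about.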


Intuition-Driven Offensive Security by Andy Grant

When I first started working in security, becoming a red-teamer or a pentester felt like a class of jobs reserved only for the most technical experts in the field. There’s something beautiful in deconstructing a system’s assumptions, building tools to probe those assumptions for weaknesses, and then exploiting them to achieve an objective. At the time, I was only aware of jobs at consulting firms with intense interview processes, so I never felt I could make it.

As I progressed in my career, I started to meet and work with red teams. They typically fit into a mold where they engage and produce a report. As a blue teamer, it was hard for me to understand the value of a report when the engagement with that same team stopped after the delivery. I think this was the same feeling that some other companies felt after engaging a pentesting firm. The hard work started with the findings, not the engagement.

Grant revisits this concept and provides a better working model for red teamers that he dubs intuition-driven security. The three principles he lays out focus on understanding the risk behind an implementation rather than hunting and reporting bugs. IMHO, this is a much sounder approach because it forces red teamers to think like security engineers rather than pentesters. If the outcome is risk reduction, the incentive structure rewards knowledge of the engineering behind a service. This knowledge drives empathy for the problems the service solves and serves as a forcing function for closing the security gaps the team finds during an engagement.


Practical Resources for Detection Engineers. || Starters 🕵🏻 and Pro || by Goodness Adediran

I love reading “Introduction to Detection Engineering” posts because you get a good diversity of thought about how to break into the field. Some folks focus on the required expertise but keep it vague enough that you can retrofit it to your own situation. Others look at more tactical details, like technologies to learn, such as SIEMs or query languages. Adediran took an approach I first saw from Katie Nickels in her series on self-studying threat intelligence.

This post provides a self-study roadmap for readers who want to break into detection engineering. Adediran splits this up into foundational blogs on the subject, studying MITRE to get a better understanding of how it maps to rules, and then crescendos out to specialist subjects across several mediums like blogs, videos, books, open-source repositories and podcast episodes.


Purple Team Maturity Model: From Chaos to Controlled Chaos by Silas Potter

I’m a big fan of maturity models, because they set a clear direction and roadmap for a program or function, but leave enough wiggle room to add, remove, or change milestones to fit your business context. In my professional experience, they’ve helped me set a tone for reporting maturity to leadership and provide an excellent north star for folks reporting into my org. So, when a new “maturity” model pops up in my feed, I almost always read it and steal ideas to use for my own purposes :).

Purple Teaming is an excellent way to improve the operational robustness of your detection program, so I was pleased to see Potter’s approach here to quantify how to achieve a well-oiled purple teaming function. Notice that this isn’t about a specific team doing purple teaming; instead, it’s a program across multiple teams, the obvious one being the joining of red and blue teams. I like this approach because it helps unite two teams who may not be talking to each other and showcases the value of both functions by driving detection outcomes rather than churning out rules or red team reports.


☣️ Threat Landscape

K000154696: F5 Security Incident by F5

Network and security appliance vendor F5 posted a harrowing security incident update involving a “highly sophisticated nation-state threat actor.” This threat actor had long-term access to F5’s product development environment, and according to cvedetails, F5 has close to 300 products. With the ability to download code and knowledge bases, a well-resourced actor could use that access for product research and reverse engineering, whether for competitive products in their home country or to ease vulnerability research.


Securing the Future: Changes to Internet Explorer Mode in Microsoft Edge by Gareth Evans

The Microsoft Edge security team shipped a new secure-by-default configuration for Internet Explorer mode in Microsoft Edge. This is the first time I’d heard of Internet Explorer mode, and I chuckled reading this because I had a feeling it had to do with active exploitation of legacy Internet Explorer code shipped inside Edge, and voila!

The team seemed to plug the holes of some of the exploit vectors, but they switched off certain UI elements by default to limit the blast radius of threat actors abusing the backward-compatible technology. Basically, if you have to use this mode, it’s shipped with minimal functionality to access the resources you need, and an administrator must turn on any additional functionality.


Rubygems.org AWS Root Access Event – September 2025 by Shan Cureton / Ruby Central

Long-lived access key security incidents strike again! Cureton, the Executive Director for Ruby Central, published a detailed security incident report after a blog post disclosed to the open source community that a former maintainer still had production access to Ruby’s AWS account. The blog showed several screenshots and a CLI command purporting to show that the maintainer retained access via an AWS access key.

In response to the post, the Ruby Central team performed a series of containment actions to remove this access, and did not accuse the maintainer of anything malicious. But the post and this incident report show how hard it is to maintain a governance structure for an open-source non-profit that relies on contractors and volunteers to maintain the project.


Singularity: Deep Dive into a Modern Stealth Linux Kernel Rootkit by MatheuZSec

Two weeks in a row, I’ve read some great pieces on modern Linux kernel rootkits, so it was nice to see this one look at a rootkit leveraging ftrace-style hooking for its persistence and evasion capabilities. MatheuZ breaks down the source code of the rootkit itself, including the hooking techniques, and highlights some differentiators between this rootkit and others in the space. The attention to detail the rootkit creator put into concealing directories, for example, shows how much of a cat-and-mouse game this is.

When you hide a directory, you may not be able to see its name or contents via list commands, but you may leak metadata that a hidden directory exists. For example, if a directory contains three subdirectories and you hide one, ls will show only two subdirectories. However, the parent directory’s link count (visible via stat or ls -ld) would still reflect three subdirectories unless adjusted.

This discrepancy between the visible subdirectory count and the link count is a forensic artifact that can reveal hidden directories. This rootkit accounts for the discrepancy and hooks a function to compute the number of links for backdoored directories accordingly.
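
The link-count check described above can be sketched as a pure calculation. This is my own minimal illustration of the forensic artifact, not MatheuZ’s tooling:

```python
def link_count_discrepancy(st_nlink: int, visible_subdirs: int) -> int:
    """On traditional Unix filesystems, a directory's link count is
    2 + (number of subdirectories): one link for its own '.', one from
    its parent's entry, and one '..' entry per child. A rootkit that
    hides a subdirectory from readdir() without patching the link
    count leaves a measurable gap.
    """
    return (st_nlink - 2) - visible_subdirs

# The parent really holds three subdirectories, so st_nlink is 5.
# The rootkit hides one, so directory listings show only two:
print(link_count_discrepancy(5, 2))  # 1 -> one hidden subdirectory
```

Caveat: some filesystems (btrfs, for one) don’t follow the classic link-count convention for directories, so a real hunt would baseline per-filesystem first.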


🔗 Open Source

ngsoti/rulezet-core

This codebase serves the complete application running on rulezet.org. It looks like an open source version of detections.ai that you can host yourself. It pulls in open-source rulesets, and you can use it to manage your own rules via a community-style setup.


eset/malware-ioc

ESET’s long-running repository of malware IOCs is based on blog posts and investigations they’ve done over the years. It’s cool to see commits from close to a decade ago. Each subdirectory has a README describing the malware family and contains the associated IOCs.


KittenBusters/CharmingKitten

For the last two or so weeks, KittenBusters has been publishing commits to this repository that detail the operations behind Iran’s IRGC-IO Counterintelligence division. It is split up into “episodes”, and so far, three episodes have been published. It contains sensitive documents and malware code, and it looks like they will start doxxing certain officials in upcoming episodes.


cisagov/LME

Logging Made Easy (LME) is CISA’s initiative on leveraging open source tools to enable a security operations function on a budget. It uses Wazuh and Elasticsearch, and the target audience is smaller shops with a small security team or none at all. Probably very helpful for state and local municipalities that CISA works with during incidents.

Every week, I read, watch and listen to all the Detection Engineering content so you can consume it all in 10 minutes. Subscribe and get a weekly digest of the latest and greatest in threat detection engineering!

DEW #132 - Linux Rootkits Evolution, LLM Rule Evals, Oracle 0-day exploitation

8 October 2025 at 14:03

Welcome to Issue #132 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week

  • I spent the weekend hiking in the White Mountains in New Hampshire with my family. Turns out hiking is much harder when you have to carry kids who are strapped in a backpack

  • I got excited for a new season of The Amazing Race, and all of the competitors are from a separate reality show?? It’s not good

  • I’m staying away from all discussion around Taylor Swift’s new album

⏪ Did you miss the previous issues? I’m sure you wouldn’t, but JUST in case:

This week’s sponsor: Material Security

No More Babysitting the Security of Your Google Workspace

While your employees communicate via email and access sensitive files, Material quietly contains what’s lying in wait—phishing attacks in Gmail, exposed Drive files, and suspicious account activity. Agentless and API-first, it stops attacks and triages user reports with AI while running safe, automatic fixes so you don’t have to hover. Search everything in seconds, stream alerts to your SIEM, and audit with detailed access logs.

Simplify Your Google Workspace Security


💎 Detection Engineering Gem 💎

FlipSwitch: a Novel Syscall Hooking Technique by Remco Sprooten and Ruben Groenewoud

I first cut my teeth on writing malware when I was the red team captain at my alma mater’s yearly cybersecurity competition. I took a special interest in writing malware for Linux for several reasons. It was a special combination of operating systems knowledge and nuanced differences between kernel versions and Linux distros. It also felt harder than Windows in peculiar ways. For example, Windows is extremely good at backwards compatibility, so a piece of malware that interacts with the kernel in all kinds of ways stays consistent between versions. In Linux, by contrast, a single kernel version update can break backwards compatibility for legitimate and malicious software alike.

That’s what brings us to FlipSwitch. Elastic Security Researchers Sprooten and Groenewoud did a deep dive on the latest 6.9 version of the Linux Kernel and inspected how changes to an array that stores syscall addresses render a classic Kernel rootkit technique useless. The method relies on hooking addresses in the sys_call_table array to point to attacker-controlled code before trampolining back to the original syscall.

Pulled from Elastic’s blog

Line 10 is the change that killed rootkits like Diamorphine. This is where FlipSwitch comes in.

The Elastic team did a fantastic breakdown in their blog, so I’ll give my synopsis. The technique involves searching the running kernel’s memory for the call instructions associated with the syscalls FlipSwitch wants to hook. The 0xe8 opcode is the key: when you load the malicious kernel module, you can leverage its privilege to scan the new x64_sys_call dispatcher for 0xe8 call instructions, resolve the offset of the specific function you want to hook, then patch it.

It’s pretty elegant, and it shows how a singular protection can kill one class of techniques but open up another class to exploit.
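
To make the opcode-scanning idea concrete, here’s a userland sketch in Python — my own illustration, not Elastic’s code or the rootkit’s — that finds 0xE8 near-call instructions in a byte blob and resolves their rel32 targets, the same arithmetic a FlipSwitch-style module performs against x64_sys_call:

```python
import struct

def find_call_targets(code: bytes, base: int):
    """Scan a code blob for x86-64 near-call instructions (opcode 0xE8)
    and resolve each rel32 target: target = address_of_next_insn + rel32.
    Returns (call_site, target) address pairs.
    """
    targets = []
    for i, b in enumerate(code):
        if b == 0xE8 and i + 5 <= len(code):
            # rel32 is a signed little-endian displacement after the opcode
            rel = struct.unpack_from("<i", code, i + 1)[0]
            targets.append((base + i, base + i + 5 + rel))
    return targets

# A single `call +0x10` followed by NOP padding: the next instruction
# sits at 0x1005, so the resolved target is 0x1015.
blob = b"\xe8\x10\x00\x00\x00" + b"\x90" * 11
print(find_call_targets(blob, 0x1000))  # [(4096, 4117)]
```

A naive byte scan like this can false-positive on 0xE8 appearing inside other instructions’ operands; the real technique validates candidates against the known function addresses it wants to hook.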


🔬 State of the Art

Bridging the Gap: How I used LLM Agents to Translate Threat Intelligence into Sigma Detections by Giulia Consonni

I’m glad to see more research and homelab-style blogs on how to build detection engineering agentic systems. It demystifies some of the hype surrounding products in this space, and just like Splunk did with SIEM by creating a community edition, it makes it easier for people to enter our field. I immediately clicked on this post because the title really excited me, and the post didn’t disappoint!

Consonni’s project involves building out an LLM agent system that translates threat intelligence into detection rules. They leveraged http://crewai.com/ (which I had never heard of), a platform that hosts AI agents, provides an SDK for writing those agents, and makes it easy to focus on building the system rather than worrying about architecture and scale. Consonni started with a single prompt that included the whole workflow of “read report → extract TTPs → create rules,” and it did a terrible job due to the broadness of the request. They refined the process with a multi-agent setup, more specific prompting, and a switch of foundation models; the resulting rules were impressive.
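
That decomposition can be sketched framework-free. CrewAI wires this up for real; `llm` here is a stub I invented so the staging is visible without a model or API key:

```python
def extract_ttps(report, llm):
    """Stage 1: a narrow prompt that only extracts techniques."""
    return llm(f"List the MITRE ATT&CK techniques in this report:\n{report}")

def draft_sigma(ttp, llm):
    """Stage 2: one rule per technique, nothing else."""
    return llm(f"Write one Sigma rule detecting {ttp}. Output YAML only.")

def pipeline(report, llm):
    # Each stage gets a small, checkable job instead of one giant prompt
    return [draft_sigma(t, llm) for t in extract_ttps(report, llm)]

# Stub model so the wiring runs standalone:
fake = lambda p: ["T1059.001"] if p.startswith("List") else "title: demo"
print(pipeline("...report text...", fake))  # ['title: demo']
```

The design point is the same one Consonni landed on: narrow prompts per agent are easier to evaluate and fix than one broad instruction.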


More than “plausible nonsense”: A rigorous eval for ADÉ, our security coding agent by Bobby Filar and Dr. Anna Bertiger

This post is an EXCELLENT read after the LLM detection rule creator post by Consonni listed above.

Determining the performance of a machine learning model is as old as the field of statistics itself. The basic premise behind performance measurement is building a predictive system, testing it against real-world data, and measuring its efficacy. Sounds familiar, right? Just like detection rules.

Naturally, LLMs should have the same type of evaluation criteria for implementers to trust and verify performance. I hadn’t seen a comprehensive evaluation framework for detection rules until I came across this post by Filar and Dr. Bertiger. The Sublime team built a detection evaluation framework for their LLM-backed detection engineer, dubbed ADÉ. The idea is that the team tried to encode success metrics for new detection rules written in the Sublime DSL. These success metrics should be familiar to long-time readers of this newsletter and to those who have read my Field Manual posts.

They split evaluations into three steps: precision, robustness, and cost to deploy and run. The lovely thing about these three evaluations is that they really capture how detection engineers think about testing rules before they deploy them.

  • Precision measures accuracy and net-new coverage, which, according to Filar and Dr. Bertiger, is the marginal value a rule adds when running alongside existing detections against known campaigns.

  • The robustness step dissects the rule’s abstract syntax tree to identify and penalize lower-value detection mechanisms, such as IP matching. Think of this as penalizing the lower parts of the Pyramid of Pain

  • The cost step looks at how many attempts the model took to generate a production-quality rule, the time to deployment of that rule, and the runtime cost of the rule in production

They list evaluations of several rules towards the end of the post, and I’m impressed by their performance. They compare the results to a human-written rule, and it appears to have performed well in some detection types against humans but underperformed in others. However, the idea here (in my opinion) isn’t to replace humans, but to augment us, and I think this framework helps achieve that.
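
The robustness idea — penalizing low-pyramid atoms found in a rule — can be approximated with a toy scorer. This is my sketch, not Sublime’s implementation, and it uses regex over rule text rather than a real AST walk:

```python
import re

# Indicator literals an attacker can trivially rotate (bottom of the
# Pyramid of Pain). A real AST pass would inspect nodes, not text.
BRITTLE = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),  # hard-coded IPv4
    re.compile(r"\b[a-fA-F0-9]{32,64}\b"),       # MD5/SHA hash literal
]

def brittle_atoms(rule_text: str) -> int:
    """Count low-value indicator literals in a rule; higher means the
    rule leans on rotatable atoms and deserves a robustness penalty."""
    return sum(len(p.findall(rule_text)) for p in BRITTLE)

rule = "sender.ip == '203.0.113.7' and subject.contains('invoice')"
print(brittle_atoms(rule))  # 1 -> one brittle IP literal to penalize
```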


How to Create a Hunting Hypothesis by Deniz Topaloglu

The best way to threat hunt is to challenge assumptions. In my experience, these assumptions typically fall into several buckets, including:

  • Rules that fail to capture threat activity

  • Telemetry sources contain threat activity that we haven’t accounted for

  • Threat intelligence informs us of something we should be aware of in the pyramid of pain

Forming a hypothesis, then, takes assumptions and tries to challenge them to uncover gaps in rules or telemetry, and in the worst case, find an incident that you’ve missed. It’s a formulaic process, but this post shows how powerful threat hunting can be when you lay out your assumptions and what you know so you can deep dive into a hypothesis.

Topaloglu starts with a piece of threat intelligence, maps out potential TTPs in MITRE, shows an example network diagram, and then creates a hunting plan. They lay out several scenarios and their corresponding SIEM search queries in several languages, and continue on to post-hunt activities for aspiring hunters to follow up on because threat hunts should provide more value than just confirming whether activity is present or not in a network.


The Great SIEM Bake-Off: Is Your SOC About to Get Burned? by Matt Snyder

Choosing a SIEM is like selecting a business partner. You need to ensure that you understand the strengths and weaknesses of each other and create an operating model to compensate for them. It’s great to see a blog exploring the topic of procuring a SIEM and the pain associated with switching from one deployment to another. This piece is beneficial for aspiring analysts or detection and response engineers who’ve never been through this type of exercise, because it truly feels like a mountain to climb that can put your company and productivity at risk.

Snyder points out five key areas of concern where switching costs can kill productivity: ingest, search, enrichment, rules, and administration. SIEM vendors should help you understand each component during a demo. Even then, many demos showcase only the best parts of the technology, so a bake-off between SIEM vendors via proofs of concept, paired with Snyder’s linked Maturity Tracker, can alleviate much of the uncertainty in these exercises.


☣️ Threat Landscape

CrowdStrike Identifies Campaign Targeting Oracle E-Business Suite via Zero-Day Vulnerability (now tracked as CVE-2025-61882) by CrowdStrike

The large vulnerability news du jour is a remote code execution in Oracle E-Business Suite tracked under CVE-2025-61882. The CrowdStrike research team made this post detailing their observations as threat actors and researchers alike conduct mass exploitation to take advantage of the vulnerability.

The exploit chain involves a series of crafted payloads to two JSP endpoints, where an unauthenticated attacker uploads a malicious XSLT file. This, in turn, triggers an outbound Java request to an attacker-controlled command-and-control server to load a webshell on victim machines.

The remarkable aspect here is how the exploit was disseminated. Oracle made a public post with IOCs, a PoC was posted on October 3, and according to CrowdStrike, threat actors under the ShinyHunters moniker posted an exploit file to their main Telegram channel.


Red Hat Consulting breach puts over 5000 high profile enterprise customers at risk — in detail by Kevin Beaumont

Red Hat Consulting, the technology services arm of Red Hat, allegedly suffered a data breach from a threat actor group dubbed “Crimson Collective.” It’s unclear how this breach happened, but the group began posting screenshots of the pilfered victim data. Beaumont uncovered some interesting details about this threat actor group, thanks to the assistance of Brian Krebs. They seem to overlap with Scattered Spider/Shiny Hunters, and one of the Telegram posts made by the group had a “Miku” signature at the end. Miku is an alleged member of Scattered Spider who was arrested last year but is under house arrest.

The victim details were posted on the Scattered LAPSUS$ Hunters victim leak site, and it appears to contain a trove of customer data from Red Hat Consulting, including some sensitive information.


DPRK IT Workers: Inside North Korea’s Crypto Laundering Network by Chainalysis

My favorite thing about reading Chainalysis blogs is getting a glimpse into how money laundering works at a cryptocurrency scale. Unless you’re a freak of nature and read indictments or court documents with detailed notes on traditional money laundering techniques, it’s rare to see how criminal and nation-state operations do the hard work of funneling money.

So, in this blog, the Chainalysis team studied the tactics, techniques and procedures of DPRK IT Worker laundering. They have a structured approach to taking payment in stablecoins, laundering it to a “consolidation” worker, and eventually offloading the consolidated funds to fiat.


Don’t Sweat the *Fix Techniques by Tyler Bohlmann

When I first read about ClickFix, I didn’t think it would be a successful approach to infection and initial access. The premise was a bit crazy: you funnel victims to a website, socially engineer them to believe there’s a problem with their computer, and convince them to willingly copy and paste a malicious command into their terminal.

Well, I was wrong; this technique works beautifully, and according to Bohlmann, Huntress has observed a 600%+ increase in these styles of attack since their inception last year. In this post, they review the different styles of ClickFix, the attack chains, and the clever ways they trick users into running the malicious payloads.


🔗 Open Source

1337-42/FlipSwitch-dev

Sprooten’s FlipSwitch PoC repo is referenced in the Gem above. It does more than just demonstrate the technique; you can use this as a rootkit kernel module in the latest versions of the Linux Kernel, and it supports some fun obfuscation techniques to make it harder to find.


ti-to-sigma-crew

Threat intelligence report to Sigma rule generator. This repository is based on Consonni’s research linked above. It looks pretty easy to use: a templated CrewAI application where you add knowledge files (such as example detection rules), with what appears to be a SQLite database backing the RAG components.


matt-snyder-stuff/Security-Maturity-Tracking

Simple yet effective security maturity tracking framework for a security operations program. The repository lists each capability you want to track, such as SIEM, Threat Hunting, and Threat Intelligence, and you can create maturity matrices for each one and track progress. These are generally pretty good for presenting program development up to leadership.


thalesgroup-cert/suspicious

Open-source anti-phishing and investigation application for investigators, analysts and CERT folks. You set it up, tie it to an inbox, have users forward suspicious emails to it, and it’ll pull apart the email, perform threat intel lookups and present a report for further analysis.


CERT-Polska/karton

A dynamic malware analysis platform where you can build malware processing backends all in Python. It comes with several backends out of the box, including a malware sandbox, an archive extractor, and a malware configuration extractor. It looks pretty easy to write your own backends, and you can submit samples via an API or the dashboard to extend functionality.

DEW #131 - ❄️New EDR bypass❄️, CTI Poverty, AWS Infra Canaries & Hunting in IMDS

1 October 2025 at 13:27

Welcome to Issue #131 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week

  • My new office desk is done, and my office feels so much more organized with better use of space

  • I learned you can 3D print How To Train Your Dragon toys and stole one of these from my kid, who got it as a present

  • Got a ticket to DistrictCon, so I’ll hopefully see some of you in person!

⏪ Did you miss the previous issues? I’m sure you wouldn’t, but JUST in case:

🚨 Detection engineers, threat hunters, CTI teams - this one’s for you.

Join us LIVE on October 7th for “From Threat Intel to Detection Rules in Minutes (Not Hours)” - a hands-on webinar with detections.ai

Presenters: Aaron Mog & Tim Peck from detections.ai

Let’s stop drowning in intel and start deploying smarter.
📅 Save your spot now 👇

🌍 APAC: 10:00 AM SGT: Register

🌍 EMEA: 2:00 PM GMT: Register

🌍Americas: 11:00 AM PST: Register

Join @ detections.ai - Use invite code “DEW“ to get started


💎 Detection Engineering Gem 💎

EDR-Freeze: A Tool That Puts EDRs And Antivirus Into A Coma State by Zero Salarium

This is a clever attack against EDR tooling that exploits a vulnerability in Windows Error Reporting (WER) to force target processes into a suspended state. The race condition, dubbed EDR-Freeze, abuses the MiniDumpWriteDump function from the DbgHelp debugging API, tricking it into thinking it’s creating a memory dump of the EDR process. However, since EDRs are protected by Protected Process Light (PPL), an anti-tampering mechanism introduced in Windows 8.1, the attacking process must also run as PPL.

So, the attacker starts the WER executable, WerFaultSecure.exe, and has it suspend the EDR process via the MiniDumpWriteDump call. EDR-Freeze monitors for the EDR process to become suspended, winning the race condition, and then suspends WerFaultSecure itself, which blocks the EDR process from ever being resumed.

It appears that some EDRs are affected, but it was interesting to see the various responses from different companies. For example, Elastic Researchers noted that the technique doesn’t work due to a rule they implemented to block the use of WerFaultSecure.


🔬 State of the Art

Intelligence Poverty and the Commercial Data Economy by Joe Slowik

In my career, cyber threat intelligence (CTI) has been a hard sell to people outside of security. Once you think about the kind of cool stuff you get to do in this genre of security, it seems evident that others would want it. However, I think this bias can cloud people’s perception of its usefulness to a security organization. It’s intangible in many ways, and depending on how mature your program is, it focuses on “what’s out there” versus “what’s in here”.

Some of this bias is cultural, since CTI was born from military and spy operations, where the stakes are higher because someone’s life may be on the line. But when you introduce the cyber element, it becomes a more frustrating exercise in information asymmetry. This asymmetry is what breeds the market and the vendors who sell into it: they have data that you don’t.

This is why Joe’s post here is so timely and relevant (just like threat intel!) Many people, including myself, are using VirusTotal (now Google Threat Intelligence) this year. When a company has a monopoly on crowdsourced and expert-created cyber threat intelligence data, it can essentially charge what it wants. According to Joe, this economy of scale creates an “intelligence poverty” for those outside large organizations with a budget to compete.

It makes it even harder for people trying to break into the industry, or for those who do it as independent researchers, to take advantage of data that can be the difference between a breach discovery and not. I really wouldn’t know what to recommend for people who want to do more OSINT-style CTI using these platforms. I’m fortunate enough to be a consumer of these platforms or to have been given researcher accounts. Still, this commercialization may force new analysts to work in fewer places than before.


Our plan for a more secure npm supply chain by Xavier René-Corail

GitHub’s Director of Security Research published a post about GitHub’s response to the last several weeks of supply chain attacks against npm. The biggest offender, the Shai-Hulud worm, demonstrated how fragile some of these ecosystems can be in terms of security. The open-source community reacted swiftly, analyzing the malware code and issuing warnings to GitHub. However, he argues, GitHub itself needs to take stronger action against these types of attacks.

The GitHub security team is consolidating npm publishing into three options, a combination of phasing out long-lived publishing tokens and moving to “Trusted Publishing” backed by 2FA. They are also removing several legacy publishing options, and some of these changes look harder than others to implement. For example, they recommend moving from OTP-based 2FA to FIDO-based 2FA, but that can be cost-prohibitive or a logistical nightmare to roll out.


IMDS Abused: Hunting Rare Behaviors to Uncover Exploits by Hila Ramati and Gili Tikochinski

Wiz researchers Ramati and Tikochinski perform a threat-hunting deep dive on unusual IMDS usage across their customer environments. IMDS is a beast of a service - without instance metadata, it’s much harder for applications to understand configuration and service data related to the infrastructure they run on. It’s served on the link-local address 169.254.169.254, so theoretically, only applications running on the instance itself can access the service.

This configuration service is an attractive target for attackers, so if they can devise creative ways to access the API, they can use it to steal credentials and move from the instance into the cloud environment. Attackers, unlike services and code, don’t usually fall within the normal behavioral patterns of accessing the service, so this is where Ramati and Tikochinski start to hunt for compromises.

Once they baselined cross-customer usage of IMDS, they found three compromises related to N-day exploits against various services. I feel that threat hunting is primarily about baselining behavior and identifying outliers, and this blog is a great demonstration of that.
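
The baseline-then-outlier approach can be sketched in a few lines. The event shape here is hypothetical — a stand-in for whatever telemetry records connections to 169.254.169.254 — and this is my illustration of the idea, not Wiz’s pipeline:

```python
from collections import Counter

def rare_imds_callers(events, threshold=0.01):
    """Baseline which process names touch IMDS across a fleet, then
    flag the ones responsible for less than `threshold` of all calls.
    `events` is an iterable of (host, process_name) pairs.
    """
    by_proc = Counter(proc for _, proc in events)
    total = sum(by_proc.values())
    return sorted(p for p, n in by_proc.items() if n / total < threshold)

# cloud-init hammering IMDS is normal; a lone curl is the outlier
events = [("web-1", "cloud-init")] * 500 + [("web-2", "cloud-init")] * 490 \
       + [("web-3", "curl")] * 3
print(rare_imds_callers(events))  # ['curl']
```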


Introducing the AWS Infrastructure Canarytoken by Marco Slaviero

This is a neat feature update from Thinkst Canary, one of the OG companies offering canary token capabilities to security teams. Free-tier and paid users can now leverage their AWS Infrastructure Canarytoken, which is a specialized feature that deploys canary infrastructure. It leverages a combination of AWS permissions, Terraform files, and some special sauce to “learn” your AWS environment and deploy what it thinks is the best canary-style cloud resource. There are two required cross-account integrations: one involves giving temporary access to Thinkst so they can “learn” your infrastructure. The second is a long-term cross-account access that sends your CloudTrail events from the canaries to their main AWS account for alerting and processing.


☣️ Threat Landscape

I’m posting two quick-hit podcast episodes from friends of the newsletter, The Three Buddy Problem.

In this interview, Ryan & Juan interview Aurora Johnson and Trevor Hilligoss from SpyCloud. They gave an overview of a Com-like community in China that performs similar harassment and insider-threat-style crimes. The difference between this group, dubbed “Internet Toilets” (also the name of their talk), and The Com is access to much more persona data, thanks to corrupt officials in local Chinese governments.

This episode is a 12-year lookback on Mandiant’s first-ever threat report on APT1. This was a pivotal moment for cybersecurity, as it showed how much visibility private firms possess and how nicely it can overlap with government spy operations. I was one year into my career when I first read this report, and I was blown away. I entered my first threat research job that same year, and the rest is history :).


That Secret Service SIM farm story is bogus by Robert Graham

The big news last week involved the Secret Service busting a SIM farm. The PBS story I linked here claims it could have been used to “collapse telecom networks”. One of the agents had a quote suggesting that a nation-state might have run it.

Several news outlets started poking holes in that claim, and Graham’s piece points out why. A possible reason it sounded like a nation-state operation was its financial scale, plus the fact that the lead that surfaced the farm involved a text from this likely “spam farm.” It’s kind of like saying AWS was responsible for a nation-state hack from a China-nexus actor because it originated from an AWS IP.


September 26 Advisory: SNMP RCE in Cisco IOS and IOS XE Software [CVE‑2025‑20352] by Censys Security Research

The Censys team’s threat advisory on the latest Cisco vulnerability provides valuable information on the Internet exposure of vulnerable devices: a specially crafted SNMP packet can lead to a stack overflow on any of the roughly 192,000 exposed devices they observed. The prerequisite is that the attacker must be authenticated to the device. A guest or low-privileged account can initiate the attack and achieve a DoS, whereas a high-privileged account can achieve RCE and pivot into internal networks.


Canary tokens: Learn all about the unsung heroes of security at Grafana Labs by Mostafa Moradian

Grafana’s Security Research team published this post as a follow-up to the security incident they experienced in May. I really enjoy reading about lessons learned from companies that suffer an incident like this, because firms tend to be risk-averse and not publish details. The follow-up by Moradian covers the use of canary tokens in their infrastructure to identify leaks in their source code.

The team had canary tokens placed throughout their codebase, which the threat actor exfiltrated during the attack. As expected, the threat actor leveraged TruffleHog to scan the codebase for exposed secrets. TruffleHog can sift through code, configuration files, and even commit history. You can configure it to check the validity of each secret, reaching out to the various service platforms and looking for a response indicating the secret is live. Once the actor’s scan reached out to AWS, it triggered a critical alert to Grafana’s Detection & Response team, who were able to identify the repository from which the secret was stolen.

These tokens offer a cheap and effective way to get some high-fidelity alerts, especially in the case of exfiltration, such as what happened to Grafana.
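
The detection side of that story can be sketched: a TruffleHog-style scanner first pattern-matches candidate secrets, and the verification step (calling AWS to check whether the key is live) is exactly what fires the canary. A minimal matcher, using AWS’s documented `AKIAIOSFODNN7EXAMPLE` sample key — my illustration, not TruffleHog’s code:

```python
import re

# AWS access key IDs start with AKIA (long-term) or ASIA (temporary),
# followed by 16 uppercase alphanumeric characters.
AWS_KEY_RE = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")

def find_candidate_keys(text: str):
    """Return strings shaped like AWS access key IDs. A real scanner
    would then verify each candidate against AWS - the call that a
    planted canary token turns into a high-fidelity alert."""
    return [m.group(0) for m in AWS_KEY_RE.finditer(text)]

config = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\nregion = us-east-1\n"
print(find_candidate_keys(config))  # ['AKIAIOSFODNN7EXAMPLE']
```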


🔗 Open Source

acquiredsecurity/forensic-timeliner

Security incident timeline builder for DFIR investigators. It can ingest output from several forensics tools, such as Chainsaw and Hayabusa, combine it into a single format, and create a nice queryable timeline. It has a GUI that looks a lot like Wireshark, which I got excited about :).


gabriel-sztejnworcel/pipe-intercept

This is a neat named pipe interceptor for Windows that leverages WebSockets so you can view named pipe communication via tools like Burp. You specify a target named pipe via the command line argument, connect to the WebSocket via your preferred tool, and see the live IPC traffic over the wire.


awslabs/amazon-bedrock-agentcore-samples

Amazon recently launched AgentCore, their service providing agentic infrastructure. I linked their samples here because it seems pretty straightforward to get a full agentic infrastructure up for security use cases. For example, you can load in system prompts for security triage, leverage S3 as a vector database and upload runbooks and rule descriptions, and connect to their MCP servers for telemetry querying using natural language.


microsoft/avml

Memory acquisition tool for Linux. You compile it into a binary, load it on a target system, and capture memory for offline analysis. It has native functionality to upload to Azure Blob Storage. It uses the LiME output format, though I’m unsure if the Microsoft devs are aware that LiME is no longer being developed.

DEW #130 - God-mode Azure vulnerability, Composite Detections & Detection Observability

24 September 2025 at 14:03

Welcome to Issue #130 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week

  • I am putting together an Uplift desk and woo let me tell you there are a lot of pieces (electronics)

  • Had a fantastic time in NY for $DAYJOB and managed to spill pasta on my jeans at a team dinner, in front of my boss and CISO

  • Brought the family to an organic farm fair and watched a demonstration with a Border Collie gathering sheep. Made me miss my dog Pasha a lot :(

  • Vibe coded some infra for the newsletter to make news collection easier. Post to Discord → auto add to new issues in Notion

⏪ Check out last week’s issue if you’ve missed it!

This Week’s Sponsor: detections.ai

Community Inspired. AI Enhanced. Better Detections.

detections.ai uses AI to transform threat intel into detection rules across any security platform. Join 7,500 detection engineers leveraging AI-powered detection engineering to stay ahead of attackers.

Our AI analyzes the latest CTI to create rules in SIGMA, SPL, YARA-L, KQL, and YARA, and translates them into additional languages. Community rules cover PowerShell execution, lateral movement, service installations, and hundreds of other threat scenarios.

Join @ detections.ai

Use invite code "DEW" to get started


💎 Detection Engineering Gem 💎

One Token to rule them all - obtaining Global Admin in every Entra ID tenant via Actor tokens by Dirk-jan Mollema

I’ve rarely included vulnerability writeups in previous Gems, but this one was just way too good not to. For those not in the cloud space, the foundation of cloud security lies in hyperscalers’ shared responsibility models. Since the cloud involves running your stuff on someone else’s computer, there need to be some guarantees that they are doing so securely. The keyword here is “shared”: you are responsible for the security of your cloud deployment, while AWS, Azure, and GCP are accountable for the underlying technology that makes it possible to run your apps on their services.

If you are a bug bounty hunter targeting Azure, for example, the shared responsibility model is your target list. You could pursue cloud customer deployments and hopefully receive some substantial bug bounty payouts. But the holy grail is the parts labeled “Microsoft”. Bragging rights aside, the infrastructure they manage is opaque, both as a security control and probably because they don’t know everything they have turned on and off.

This is where Mollema’s vulnerability comes into play. Based on his post, I believe Mollema identified one of the most critical cloud vulnerabilities ever disclosed, sitting squarely in Microsoft’s “Identity and directory infrastructure” responsibility in the SaaS column. The vulnerability, CVE-2025-55241, chained three separate flaws in Azure, topped off with some brute forcing.

While researching hybrid Exchange setups, Mollema uncovered an undocumented service called “Access Control Service”. It was used for intercommunication between backend Azure services, and it issued what he called impersonation tokens.

The first flaw was that this service could create unsigned impersonation tokens for any user you specified; Entra ID wasn’t required to issue the token at all. The second flaw was that the Azure AD Graph API, a legacy API, would accept these unsigned tokens as valid. Mollema could then issue Graph API requests to ANY Azure tenant globally without the Graph API checking whether he owned the target tenant.

This is impressive because it breaks the fundamental trust boundary in the shared responsibility model; he could theoretically “hop” into any tenant he wanted. Most of this generated no telemetry until you started creating or modifying resources in a victim account, so thankfully, Mollema includes a KQL rule to detect that activity.
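To see why the first flaw is so devastating, here is a tiny sketch of an unsigned "alg: none" token (the claims are made up; this is not the actual Actor token format). Anyone can mint one for any identity, so a validator that fails to reject it has no way to tell forgery from fact:

```python
import base64
import json

def b64url(data: bytes) -> str:
    # JWT-style base64url encoding without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# An "alg: none" token: no signature exists for the receiving service to
# verify. Claims are invented for illustration, not the real Actor token.
header = {"alg": "none", "typ": "JWT"}
claims = {"sub": "admin@victim-tenant.example", "tid": "victim-tenant-id"}
token = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(claims).encode())}."

# A correct validator must inspect the header and reject "none" outright:
seg = token.split(".")[0]
alg = json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))["alg"]
print(alg)
print(token.count("."))  # unsigned JWTs keep three segments; the last is empty
```

Any API that treats a token like this as authentic has delegated identity to the attacker.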


🔬 State of the Art

Threat-agnostic detection of co-occurring MITRE ATT&CK® events using Composite Detections by Ryan Tomcik

Composite detection rules are an evolution of atomic rules, where the presence of techniques across different pieces of telemetry can show threat actor activity within a sea of noise. The idea here is to mitigate the precision versus recall tradeoff by simultaneously using two or more rules for the alerting scenario. This provides flexibility in several dimensions, such as identifying activity presence using one strategy, like a string match, versus another, like a windowed threshold.

In this piece, Tomcik leverages the Google SecOps platform to show examples of composite detections through the lens of MITRE ATT&CK. Co-occurrence is an alerting technique that attempts to identify an intrusion through a sequence of events, and detecting co-occurring MITRE techniques is a fitting illustration of composite rules in action. Discovery techniques are potentially the noisiest to catch in a live environment because they yield a lot of legitimate activity.

However, as Tomcik points out here, if you reconcile this telemetry with a singular source and find multiple discovery attempt techniques from one host, you achieve a high-fidelity alerting situation. The assumption here is that one host or log source shouldn’t be using several discovery mechanisms at once, so it could indicate a threat actor landing on a box and discovering their environment to collect information on where to pivot to next.
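To make the idea concrete, here is a hedged Python sketch (event fields and command strings are invented, not Tomcik's rules) of a composite detection that pairs a string-match rule with a windowed threshold of distinct discovery techniques per host:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative only: map command substrings to ATT&CK discovery techniques.
DISCOVERY_CMDS = {"whoami": "T1033", "net group": "T1069", "nltest": "T1482"}

def composite_discovery_alerts(events, window=timedelta(minutes=10), threshold=3):
    """Alert when one host shows >= threshold distinct discovery techniques
    inside the window: string match (rule 1) feeding a threshold (rule 2)."""
    seen = defaultdict(set)  # host -> {(technique, timestamp), ...}
    alerts = []
    for e in sorted(events, key=lambda e: e["ts"]):
        for cmd, technique in DISCOVERY_CMDS.items():
            if cmd in e["cmdline"]:  # atomic rule #1: string match
                seen[e["host"]].add((technique, e["ts"]))
        # atomic rule #2: distinct techniques observed within the window
        recent = {t for t, ts in seen[e["host"]] if e["ts"] - ts <= window}
        if len(recent) >= threshold:
            alerts.append(e["host"])
    return alerts

now = datetime(2025, 9, 24, 9, 0)
events = [
    {"host": "ws1", "ts": now, "cmdline": "whoami /all"},
    {"host": "ws1", "ts": now + timedelta(minutes=1), "cmdline": 'net group "Domain Admins"'},
    {"host": "ws1", "ts": now + timedelta(minutes=2), "cmdline": "nltest /domain_trusts"},
    {"host": "ws2", "ts": now, "cmdline": "whoami"},
]
print(composite_discovery_alerts(events))  # only ws1 crosses the threshold
```

A lone `whoami` on ws2 stays quiet; three distinct techniques on ws1 inside ten minutes fire the composite.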


Resource Gathering by Amitai Cohen

This may not be a purely technical post, but I believe it’s highly relevant for anyone in detection engineering or security, especially early in your career. Amitai, a friend of the newsletter, describes workplace situations where we are bombarded with messages, emails, and requests to do stuff, and it’s even worse with meetings stacked on top. So how do you manage all of that?

He draws the analogy of resource gathering from none other than Real-Time Strategy (RTS) games. It ultimately comes down to return on investment: when a lengthy Slack DM, document, or email lands, ask whether the time it demands is worth the impact you want to generate in your day job.

In management land, a useful tool for this is the Eisenhower Box:

https://jamesclear.com/eisenhower-box

Granted, individual contributor folks don’t have the ability to delegate, but it’s still a useful mental model to focus on what matters.


Zero-Log Checker: Automating Log Absence Detection in Wazuh by Hanif Kurniawan A.

I’ve pontificated throughout this newsletter about how security tends to steal concepts from other software engineering disciplines and rebrand them into something cool or sexy. Regression testing? THREAT EMULATION PEW PEW. Monitoring for weird or malicious logs? THREAT DETECTION BANG.

In all seriousness, it’s nice to see when concepts from software engineering enter our space and how we use them to solve real security problems. I’ve been shilling the idea of “observability for security” at my $DAYJOB, and every time I see a new post like Kurniawan’s here, it’s a good indicator that the health of your detection systems matters just as much as your rules. So, in this post, Kurniawan shows how their system detects when a log source goes down in Wazuh and how you can respond to it.

It’s a neat lab to me because I haven’t seen much Wazuh content in the mix of Splunk, ELK and Sigma. The basic premise: a Python script reads NDJSON files in the local lab environment, compares them to historical files from the log source, checks whether the latest events fall outside an expected window, and generates its own alert. Once alerted, you can review the log source’s health and ensure the forwarders are working, from the agent all the way to the network setup.
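A minimal sketch of that premise (not Kurniawan's actual script; the field names are assumptions) tracks the newest event per source and flags any source that has been silent longer than its allowed window:

```python
import io
import json
from datetime import datetime, timedelta

def silent_sources(ndjson_text, now, max_silence=timedelta(minutes=30)):
    """Parse NDJSON events, track the newest timestamp per log source, and
    return every source whose last event is older than max_silence."""
    last_seen = {}
    for line in io.StringIO(ndjson_text):
        event = json.loads(line)
        ts = datetime.fromisoformat(event["timestamp"])
        src = event["source"]
        last_seen[src] = max(last_seen.get(src, ts), ts)
    return [s for s, ts in last_seen.items() if now - ts > max_silence]

feed = "\n".join([
    json.dumps({"source": "fw-edge", "timestamp": "2025-09-24T08:00:00"}),
    json.dumps({"source": "dc-01", "timestamp": "2025-09-24T09:55:00"}),
])
print(silent_sources(feed, now=datetime(2025, 9, 24, 10, 0)))  # fw-edge went quiet
```

The alert itself then becomes a health signal: the firewall forwarder needs a look, not the rules downstream of it.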


🥊 Quick Hits

How I Built My First SIEM Detections by Garv Kamra

This is a neat “beginner” post on Kamra’s adventures in writing their first detection rules. It’s a lab environment where they set up a basic ELK stack and focused on creating correlation rules, which are definitely a challenge for a first-time rule writer!

Kamra wrote three correlation rules for credential stuffing, DNS tunneling, and suspicious firewall usage, and documented their assumptions, findings, and lessons learned in each subsection. This is a great way for folks trying this type of lab environment for the first time to learn.
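For flavor, here is a hypothetical sketch of what a credential-stuffing correlation like Kamra's first rule could look like (fields and thresholds invented): one source IP failing logins against many distinct accounts inside a short window.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def stuffing_ips(failed_logins, window=timedelta(minutes=5), min_accounts=3):
    """Flag IPs whose failed logins span >= min_accounts distinct users
    within any sliding window of the given length."""
    by_ip = defaultdict(list)
    for ts, ip, user in failed_logins:
        by_ip[ip].append((ts, user))
    flagged = []
    for ip, attempts in by_ip.items():
        attempts.sort()
        for i, (ts, _) in enumerate(attempts):
            users = {u for t, u in attempts[i:] if t - ts <= window}
            if len(users) >= min_accounts:
                flagged.append(ip)
                break
    return flagged

t0 = datetime(2026, 1, 21, 12, 0)
logins = [
    (t0, "198.51.100.9", "alice"),
    (t0 + timedelta(minutes=1), "198.51.100.9", "bob"),
    (t0 + timedelta(minutes=2), "198.51.100.9", "carol"),
    (t0 + timedelta(minutes=2), "203.0.113.5", "alice"),
]
print(stuffing_ips(logins))  # only the stuffing IP is flagged
```

A single user fat-fingering a password never trips it; the distinct-account count is what separates stuffing from noise.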


Hunting Ideas in AWS Part 1 by Jake Valesky

This is Valesky’s first blog post ever, so I was excited to see how they wrote about a topic near and dear to my heart: threat hunting in the cloud. The blog explores effective log strategies, the various types of access keys that interact with your environment, and how to monitor unauthorized changes. I’ve linked to and written about AWS threat detection in detail in this newsletter. Blogs like this demonstrate the complexity of the environment and how it should be treated as its own operating system.


☣️ Threat Landscape

Microsoft seizes 338 websites to disrupt rapidly growing ‘RaccoonO365’ phishing service by Steven Masada/Microsoft Digital Crimes Unit (DCU)

I’ve always found DCU posts where they’ve legally seized actor infrastructure fascinating. When I think of legal action against threat actors, my brain immediately goes to law enforcement actions. This is logical, considering the numerous takedowns and arrests of criminals worldwide. But a private company doing this is on a whole other level!

The DCU team seized approximately 338 websites associated with an emerging phishing-as-a-service kit, RaccoonO365. It’s a subscription model, meaning you pay monthly, and according to the tech giant, it has accrued nearly $100,000 in payments made directly to the creator. They’ve also allegedly identified the creator himself, a Nigerian man, and issued a “criminal referral”, which I’m guessing means they dropped the doxxing info to the FBI or the Nigerian government.


Two teenagers charged over Transport for London cyber attack by Joe Tidy and Graham Fraser

UK police charged two teenagers who they believe were responsible for months of disruption to Transport for London’s information systems. There were some shutdowns of the transit vehicles themselves, but most of the damage related to online services. According to the report, the two were already in trouble for other cybersecurity-related crimes, with one having evidence on their machine of targeting U.S. healthcare companies. Smells like Scattered Spider to me!


SystemBC – Bringing the Noise by Black Lotus Labs

Detecting command-and-control (C2) servers and traffic is a pastime of mine. There’s something exciting about dissecting a piece of malware, analyzing its traffic, and then trying to identify where the bad guys host their command-and-control (C2) server so you can fingerprint it and gather additional threat intelligence as they move the C2 around the Internet.

This becomes much more challenging when the C2 server you are communicating with isn’t the actual C2 server. According to Black Lotus Labs, proxy networks are becoming increasingly popular among malware families, and proxy malware such as SystemBC helps facilitate this type of activity. The basic idea is that an infection can route its traffic to the C2 server via an infected proxy to mask the origin IP address. The Black Lotus team uncovered an extensive network of SystemBC Linux variant infections, attributed them to several grayware proxy providers, and blocked traffic to affected users.


Tech Note - BeaverTail variant distributed via malicious repositories and ClickFix lure by Oliver Smith

GitLab threat intelligence researcher Smith outlines some notable TTPs in DPRK’s BeaverTail malware. BeaverTail was first detected in Contagious Interview attacks, which relied on victims cloning a repository to perform a coding test, executing a backdoor embedded within the code.

According to Smith, they’ve pivoted towards ClickFix-style attacks, and my tinfoil hat says it’s because platforms like GitLab/GitHub are getting much better at squashing these malicious repositories. He outlines the chain and how the campaign is smart enough to differentiate between operating systems, maximizing the number of victims who visit their ClickFix websites.


🔗 Open Source

Cyb3r-Monk/Microsoft-Vulnerable-Driver-Block-Lists

Cyb3r-Monk, a.k.a. Mehmet Ergene, whom I’ve featured extensively in this newsletter, has dropped a great resource for Windows Security administrators to help block vulnerable drivers. Microsoft apparently removed the webpage that lists these drivers and instead publishes a ZIP file with some XML data, making it more challenging for users to ingest and implement controls. Ergene automates this process with this repository and exposes some great metadata, allowing people to understand what they are loading into their Windows environments.


dis0rder0x00/obex

Yet another anti-EDR repository for tampering with security tools that you don’t want turned on :). The methodology here is that obex spawns itself as a process, and it attaches a debugger. The debugger hooks the LdrLoadDll function, patches a specific pointer, and uses it to catch DLLs being loaded and subsequently block them.


rotemreiss/malifiscan

With the number of open-source supply chain attacks occurring over the last few weeks, it’s encouraging to see open-source projects like this one come into play to help individuals understand the impact of malicious packages in their environment.


firezone/firezone

WireGuard-based, zero-trust networking project that allows users to deploy gateways and a management server for secure access. I love WireGuard, so I was already excited to read about this. The control plane/management server is written in Elixir, and all the gateways run in Rust, so you can expect this to be performant. They went the extra mile and created client applications for iOS, Android, Linux, Windows, and macOS.


DEW #129 - Malicious browser extensions, npm gets pwned (again) and AI weaponizing CVEs

17 September 2025 at 13:54

Welcome to Issue #129 of Detection Engineering Weekly!

  • I’m in NYC this week, and I underpacked, so I walked over to Hudson Yards to grab some T-shirts. I picked out two from Uniqlo, and when I got to self-checkout, I looked like a confused tourist, since you just “drop” your shirts into a bucket and it automatically finds the shirt and price. It was black magic

  • It’s been a busy few weeks at work with all of these supply chain-style attacks, and I’m sure a lot of you have been as well. But, I am continuously underwhelmed that these elegant package takeovers result in cryptominers and wallet stealers. If anyone wants to turn heel with me and go on a villain arc, the first thing I’d recommend is to stay away from cryptominers

  • I’ve begun to make small structural changes to the newsletter issues. I am removing italics, I changed how my From field looks on emails, and titles have a more descriptive sneak peek into the content. Don’t worry, I’m still keeping the snark and xkcd-style commentary throughout the issue, but this has already helped boost my open rates and engagement

📣📰🌐 Interested in sponsoring the newsletter and placing your ad right here?

I’m happy to see the engagement of folks reaching out to sponsor the newsletter. I have slots filling up for the rest of the year, so if you want to run an ad and get eyeballs and clicks from practitioners, CISOs and everything in between, shoot me an e-mail and let’s chat.

Sponsor Detection Engineering Weekly


⏪ Did you miss the previous issues? I'm sure you wouldn't, but JUST in case:


💎 Detection Engineering Gem 💎

Even if many plugins are fine, the bad ones are BAD by John Tuckner

As readers have seen in last week's issue, supply chain security affects the entire open-source ecosystem, which includes numerous registry-style marketplaces. For this post, a friend of the newsletter and security researcher John Tuckner shares a more in-depth look at how browsers manage the supply chain of extensions, and how we have a long way to go before we have complete visibility and detection opportunities on malicious extensions.

Unlike npm or pip, all modern browsers employ a sandbox with various security features. Memory protection, file system restriction, and process isolation are just a few, so it's tough for exploit developers to break out. But browsers also need to be extensible, and this is where extensions come in.

Extensions have marketplaces, and the browsers can either officially own these marketplaces or have them as third-party registries. Anyone can write an extension and publish it, and while some marketplaces have more stringent requirements, according to Tuckner, you can side-load them just like a mobile app. This opens up a significant risk, and if you aren't careful, you can install an overly permissive app that can read and write to your computer in ways you may not want an app to do.

It's great that these guardrails help prevent a full breakout from an extension to the operating system, but they don't stop someone from willingly installing a malicious one. Expecting an end-user to read and understand permission models and assess maliciousness across every registry, whether browser, open-source software, or IDE, is unrealistic.


🔬 State of the Art

Can AI weaponize new CVEs in under 15 minutes? by Efi Weiss and Nahman Khayet

If you've ever wanted to see how to solve a security use case with agentic systems, this is an excellent post by Weiss and Khayet on how to build and deploy one. They started with a pain point that we all suffer from: given a CVE, how fast can a researcher create and publish a proof-of-concept (PoC) exploit, and should we patch if that code makes it to the wild? It's a pain when someone releases a PoC, but I think that it helps create detection opportunities to validate impact. So, whether it comes from a researcher or their LLM agent, I'm happy to take in more data. Here's their workflow:

They focused their research solely on open-source packages. They used a combination of NIST and GHSA, and this type of structured data, alongside the patch diff, is an excellent source of data to feed into an agentic system to generate the PoC. They encountered some issues along the way, such as using a single, generalized agent for the full PoC lifecycle instead of multiple specialist agents. The other part I found pretty funny was when their agent was "refining" the PoC; the LLM focused on making the code work rather than ensuring it was vulnerable.

If I had to suggest more research into this area, I'd love to see folks take the PoC environments from their GitHub and instrument them to create the correct logs to generate detection rules. The time from CVE publication to PoC to detection rule coverage would be lightning fast and help at least some of us sleep better at night.


Automation for Threat Detection Quality Assurance by Blake Hensley

So many people ask, "How many rules do you have?" and never ask, "How are your rules doing?" Jokes aside, detection quality is a topic that is near and dear to my heart, and it's not talked about enough. Hensley is breaking that barrier, and what I love about this post is that he argues that detection quality isn't always about rule formats, linting, and emulating in CI/CD pipelines. It's also about ensuring your rules perform as intended in your live environment through experiments.

Hensley structures this post with several examples of detection quality assurance tests. You have unit tests and purple teaming emulation within the mix, but there are some really unique tests here I've never considered before. For example, Hensley's "results diff backtest" compares the output of a previous rule version's results with those of the new version. All super clever, and I'm adding this to my detection backlog.
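The results diff backtest is simple enough to sketch in a few lines (rules reduced to plain predicates for illustration; this is the idea from Hensley's post, not his implementation): run both rule versions over the same historical events and review what the new version loses and gains.

```python
# Toy historical event set; in practice this would be a SIEM query result.
events = [
    {"proc": "powershell.exe", "args": "-enc aGk="},
    {"proc": "powershell.exe", "args": "-ExecutionPolicy Bypass"},
    {"proc": "cmd.exe", "args": "/c whoami"},
]

# Old rule: any powershell. New rule: only encoded-command powershell.
old_rule = lambda e: e["proc"] == "powershell.exe"
new_rule = lambda e: e["proc"] == "powershell.exe" and "-enc" in e["args"]

old_hits = {i for i, e in enumerate(events) if old_rule(e)}
new_hits = {i for i, e in enumerate(events) if new_rule(e)}

print(sorted(old_hits - new_hits))  # matches lost by the new version: review these
print(sorted(new_hits - old_hits))  # matches gained: potential new noise
```

If the "lost" set contains true positives, the tightened rule just created a blind spot, and you learn that before it ships.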


The Present and Future of Managed Detection and Response by Migjen Hakaj and Amine Besson

This post offers a thorough overview of the Managed Detection & Response (MDR) market, aiming to answer the question, "What's next?" for companies in this space. MDRs evolved from the MSSP space, where most of the value proposition revolved around being "alert forwarders", as Besson and Hakaj put it. The market MDR providers discovered involved taking sprawling detection toolsets, unifying them, providing expert analysis, and presenting the key information that matters to customers.

You can probably see why I liked this post, starting at the section …And what is my MDR doing to improve it? The differentiator, according to Hakaj and Besson, is detection engineering. Scaling detections as more data sources are added to a technology stack becomes the moat. Adding AI on top of this will disrupt those scaling efforts, for better or for worse. This is especially interesting if MDRs hire expert analysts while their customers hire security generalists or engineers.

So, if you do partner with an MDR provider, press them on detection coverage and adding sources, while proving to you that they can be nimble in both areas.


Building a Detection Lab That Fits in Your Laptop by Joseph Gitonga

This is an excellent home lab tutorial for folks who want to get started with detection engineering and threat hunting. Gitonga targets individuals who want to break into security operations without incurring expenses on cloud-based lab environments. The required lab machine can be cost-prohibitive, but you can find beefy servers on eBay or build something from parts that meets the minimum requirements here. By the end of the lab, you'll have a Splunk SIEM, an Active Directory environment, and a separate Splunk response and automation server.


The Cost of a Wrong Word in Threat Intelligence by Rishika Desai

I recently had a sit-down with a senior leader in my company to discuss how I can assist them with security challenges. This person has been with my company for years, both before and after the IPO, and is now leading a massive engineering organization. Whenever I meet with folks I don't work with often, my number one rule is to ask lots of questions. So I asked: Where has security given you the most pain? He boiled threat intelligence down to one request: he wanted the risk context for what he was building, so he could build it securely.

Risk context is a much better name for threat intelligence. It's supposed to inform; whether you are a detection engineer building rules or a CISO looking at the latest threats, you can choose to use the information or not. Desai examines this concept, but from the perspective that the information may be incorrect. A missing word, clobbered threat actor names, or overly confident language can make or break a threat intelligence report.

I love this blog because it highlights the importance of risk context, and how that context can be screwed up if you don't communicate effectively in your writing. The same applies to detection rules, whether in the documentation or your response playbooks. Accuracy and clarity matter, and that includes conveying what you don't know as much as what you do.


☣️ Threat Landscape

Geedge & MESA Leak: Analyzing the Great Firewall’s Largest Document Leak by Mingshi Wu

The big news this week from the People's Republic of China (PRC) is one of the largest document leaks related to the Great Firewall of China. I love leaks like this for several reasons. One, it gives insight into a culture that is vastly different from what we are used to in the West. Two, we get to see the technical implementations and architecture of a serious Internet censorship apparatus. If you set aside the ethics, it's an impressive feat for a country of over 1.4 billion people.

Wu is updating this page with findings in real time, so expect it to keep growing. They also link net4people's GitHub issue tracker as researchers comb through the nearly 500 GB of data from GitLab, Confluence, and JIRA, so expect numerous findings in the coming weeks.


Inboxfuscation: Because Rules Are Meant to Be Broken by Andi Ahmeti

Permiso researcher Andi Ahmeti releases a Microsoft Exchange Inbox malicious rule creator and analyzer based on their research into threat actors abusing Inbox rules. Whenever I meet people in my day job who worry about advanced attackers using advanced techniques, I try to ground them back in situations like Ahmeti describes in this post. You may be a victim of a highly sophisticated adversary, but they would prefer to use tradecraft that is simple before resorting to their Rolodex of advanced attacks.

The most interesting aspect of this research is the exploration of Unicode obfuscation techniques and their associated detection opportunities. It reminds me a lot of malicious domain research, where using different character sets can confuse a rendering application (such as email) and not display the Punycode representation, thereby confusing the victim into thinking they received a legitimate domain to click on.
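A toy version of the detection opportunity (not Permiso's implementation) is to flag rule names, or domains, that smuggle in non-ASCII or zero-width code points:

```python
import unicodedata

def suspicious_chars(name):
    """Return (char, unicode_name) pairs for code points outside printable
    ASCII, plus format characters like zero-width spaces (category 'Cf')."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in name
        if ord(ch) > 127 or unicodedata.category(ch) == "Cf"
    ]

benign = "Move invoices to folder"
sneaky = "Mo\u200bve invoices to f\u043elder"  # zero-width space + Cyrillic 'о'

print(len(suspicious_chars(benign)))
print([n for _, n in suspicious_chars(sneaky)])
```

Two visually identical rule names produce completely different regex and keyword behavior, which is exactly what the obfuscation exploits and exactly what this kind of normalization check catches.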


Uncloaking VoidProxy: a Novel and Evasive Phishing-as-a-Service Framework | Okta Security by Houssem Eddine Bordjiba

The Okta Security Research team uncovered a new phishing-as-a-service kit dubbed VoidProxy. The campaign started with phishing lures sent from compromised email addresses on marketing platforms. The level of indirection in this kit is impressive; in particular, they put a lot of time and energy into catching security scanners and researchers, abusing Cloudflare's Turnstile CAPTCHA and Workers infrastructure to funnel potential victims into the phishing attack itself. It uses a standard attacker-in-the-middle workflow to capture non-phishing-resistant authentication codes.

Bordjiba gained access to the panel itself, providing a unique inside look at how these kits are constructed, distributed, and managed by threat actors.


Ongoing Supply Chain Attack Targets CrowdStrike npm Packages by Kush Pandya and Peter van der Zee

The news of the npm supply chain attack from last week continues, this time targeting CrowdStrike's npm package library. The Dune-themed attack has a rather unique attack chain. Once you install the backdoored package, it'll download TruffleHog and extract secrets on your local machine or CI/CD environment. It also backdoors GitHub Actions workflows with a file named shai-hulud-workflow.yml. The peculiar part is that the attacker publishes a public repository named Shai-Hulud, which can be searched for across GitHub. Still, no one has (as of this post) figured out what these repositories do.


🔗 Open Source

Permiso-io-tools/Inboxfuscation

Powershell-based Microsoft Exchange rule exploitation toolkit that I linked above under Threat Landscape. It does some really neat stuff with Unicode manipulation to defeat traditional regular expression and word-based detections.


thekibiru03/splunk-ad-lab

Splunk and Active Directory repository referenced in the lab post above by Joseph Gitonga. Has a ton of out-of-the-box logging capabilities with Sysmon, PowerShell logging, and audit policies. It also ships with some useful threat emulation tooling, including Atomic Red Team, Defender disabling, and fake users.


BlakeHensleyy/kql-tester

KQL testing repo from Blake Hensley's detection quality assurance piece under State of the Art. You provide it with a rule and your Azure credentials, and it'll perform a series of checks referenced in the blog to verify the efficacy against live data. You can adjust the query parameters or testing logic for your own detection risk tolerance.


magisterquis/sneaky_remap

Sneaky remap is a Linux defense evasion technique that hides shared object files from detection. Shared objects can back persistence, privilege escalation, and other operations, so defenders who inspect the filesystem for peculiar shared objects can catch things like LD_PRELOAD abuse in their tracks; this technique is built to evade exactly that. The theory section explains the algorithm, which takes file-backed memory mappings, preserves their permissions, and moves them to anonymous memory.


lwthiker/curl-impersonate

This tool utilizes the cURL library to impersonate modern browsers by replicating their TLS and HTTP handshakes. It differs from basic techniques like setting a User-Agent header, since it manipulates what actually goes over the wire (the HTTP handshake at Layer 7, the TLS handshake below it), which is library- and application-specific.


DEW #128 - AI Detection Engineering Uncertainty, 3D Threat Hunting and Salesloft Drift Shenanigans

10 September 2025 at 14:03

Welcome to Issue #128 of Detection Engineering Weekly!

✍️ Musings from the life of Zack in the last week


  • I’ve had to cancel work travel to Datadog’s Paris Office due to an Air Traffic Controller strike

  • I’m very close to canceling ChatGPT and using Claude exclusively. It’s an excellent threat research and engineering co-pilot

  • You’ll read more about this below, but I’m getting more freaked out about developers as targets by threat actors. Attacks against dev tooling ecosystems and macOS are increasing, and both of these target sets are a mainstay for many firms’ operations

⏪ Check out last week’s issue if you’ve missed it!

This Week’s Sponsor: Material Security

Fortify Your Google Workspace, from Gmail to Drive.

Protect the email, files, and accounts within Google Workspace from every angle. Material Security unifies advanced threat detection, data loss prevention, and rapid response within a single automated platform so your lean team can do more with less. Deploy in minutes, integrate with your SIEM, and let “set-it-and-forget-it” automation run 24/7.

Gain enterprise-grade security without enterprise overhead.

Simplify Your Google Workspace Security


💎 Detection Engineering Gem 💎

The experience of the analyst in an AI-powered present by Julien Vehent

When I think of detection engineers, I think of three disciplines: software engineering, security analysis, and statistics. I wrote about the first two of these disciplines in Field Manual #1, and started touching on statistics in Field Manual #3. The Detection Engineering Mix is born of evolution and business necessity. With limited human capacity paired with rapidly developing technology, you’ll need to adjust these three knobs in different configurations to prevent ticket queueing and alert fatigue.

What happens when you add a fourth circle to the Detection Engineering Mix? It would seem overwhelming and infeasible because the expectations of software engineering, security expertise, and statistics set the bar reasonably high. This is where the AI element comes into play. If you had told me it was a necessity two years ago, I would have told you to kick rocks. Now I’m not so sure.

In this post, Vehent offers his commentary on the uncertainty surrounding the integration of AI engineering and automation into our collective approach to threat detection. His issue with the mix, AI or not, is that new security engineers overemphasize aspects other than security expertise. This makes sense in some ways because software engineering is how we are supposed to “scale”, and it’s a hard requirement for security positions in more modern organizations.

He argues that AI will move us further away from the security expertise circle above, and we’ll lose the analytical rigor along with it. Detection and response teams should have their work cut out for them to triage and investigate, as it helps them experience the pain that software engineering, data science, and now AI can help solve. If they don’t feel the pain, how can we know when a security problem is solved?


🔬 State of the Art

Can't Hide in 3D by Certis Foster

Time-Terrain-Behavior is a threat detection modeling framework developed by MITRE that examines how three types of metadata from security telemetry can help pinpoint compromises. It’s a neat paper, so if you have time, get a cup of tea/coffee/drink and some other hygge items and check it out. These dimensions combine three alerting capabilities that we typically use in an atomic sense. Time baselines when the machine is active, Terrain examines the security tooling that generates observations on data from the machine, and Behavior baselines benign workstation activity.

Foster’s approach here is to apply the concept of TTB to a real dataset and assess its performance. They took Splunk’s Boss of the SOC (BOTS) Frothly dataset, loaded it into a SIEM, and laid out a blueprint to implement and find compromises using TTB. They broke down each TTB component into three sets of labels: Time {Morning, Evening, Weekend}, Terrain {Windows, Network, Cloud}, and Behavior {Authentication, Execution, Access}, all as distinct counts. Each distinct label was categorized even further; they plotted workstations using some incredible SPL queries and got this:

This identified one station as an outlier, and Amber was the one that was compromised! No detection rules, just some clever labeling, intuition after reading the TTB paper, and security expertise. I’m a big fan of this type of vectorized approach to threat detection, though I see a few problems that Foster also addresses:

  • They can be cost-prohibitive if you have 1000s to 10000s of assets to protect

  • They lose specificity when an actor does something during core hours, so you’ll see a larger distance on sensor hits, but it may underfit on other components like time

  • Sensor hits here can also mess up specificity if your tooling is inadequate

I do love this mathy approach!
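The labeling-and-distance idea can be sketched in a few lines of Python. Everything below is illustrative: the host names and distinct counts are made up, not pulled from the BOTS Frothly data, and the outlier test is a plain Euclidean distance from the centroid rather than Foster’s SPL.

```python
import math

# Hypothetical distinct-count vectors per workstation, one slot per TTB label:
# [Time{morning, evening, weekend}, Terrain{windows, network, cloud}]
hosts = {
    "wkst-alice": [30, 5, 1, 12, 8, 2],
    "wkst-bob":   [28, 6, 2, 11, 9, 3],
    "wkst-carol": [31, 4, 1, 13, 7, 2],
    "wkst-amber": [4, 22, 15, 3, 25, 14],  # off-hours, network/cloud heavy
}

def centroid(vectors):
    """Component-wise mean of all host vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two label-count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

center = centroid(list(hosts.values()))
scores = {host: distance(vec, center) for host, vec in hosts.items()}
outlier = max(scores, key=scores.get)  # "wkst-amber"
```

No detection rules here either: the compromised host falls out purely because its label counts sit far from everyone else’s.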


AWS CloudTrail Event Cheatsheet: A Detection Engineer’s Guide to Critical API Calls — Part 1 by Muh. Fani (Rama) Akbar

This piece contains a comprehensive compendium (alliteration, woo!) of security- and detection-relevant AWS API calls for folks who want to learn more about AWS detection engineering, and the most critical, cost-effective way to do it: AWS CloudTrail. CloudTrail is the control plane log source for AWS, and control plane events refer to administrative actions in this context. Administrator activity, role creation, and secret and key creation are all examples of administrative actions.

So, Akbar took this a step further and mapped out as many security-relevant CloudTrail logs as possible across the MITRE ATT&CK chain. Each MITRE section contains the key events, attacker activities using the AWS CLI, and the relevant SQL detection rule. The reason I love these types of blogs is not just for the educational content, but for the provable elements of the detection. You can run the attack commands on the red team side and observe how they log on the blue team side.
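As a hedged sketch of that red/blue pairing: on the red side you might run `aws iam create-access-key --user-name admin-svc`, and on the blue side filter the resulting CloudTrail records for cross-user key creation. The field names follow CloudTrail’s record schema; the ARNs, user names, and helper function are invented for illustration.

```python
def flag_cross_user_key_creation(records):
    """Flag CreateAccessKey calls where the caller mints a key for a
    different IAM user -- a common persistence move (ATT&CK T1098.001)."""
    hits = []
    for r in records:
        if r.get("eventName") != "CreateAccessKey":
            continue
        caller = r["userIdentity"]["arn"].rsplit("/", 1)[-1]
        target = r.get("requestParameters", {}).get("userName", caller)
        if target != caller:
            hits.append((caller, target))
    return hits

# Hypothetical CloudTrail records (heavily trimmed).
records = [
    {"eventName": "CreateAccessKey",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/dev-bob"},
     "requestParameters": {"userName": "admin-svc"}},
    {"eventName": "CreateAccessKey",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/dev-bob"},
     "requestParameters": {"userName": "dev-bob"}},  # self-rotation: benign
]

flag_cross_user_key_creation(records)  # [("dev-bob", "admin-svc")]
```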


Leveraging Raw Disk Reads to Bypass EDR by Christopher Ellis, Andrew Steinberg and Austin Munsch

This piece by the offensive security team at Workday exposes how to leverage raw disk reads of the hard drive to bypass controls and monitoring from a modern EDR. The most interesting aspect of this research is that it extends beyond the Windows Operating System (OS) and directly into the process of building your own driver to interact with physical devices. Windows contains several layers of APIs in user-land and kernel space. Some of these layers exist for backward compatibility, which Microsoft obsesses over so that you can ship code and have interoperability with their ecosystem.

These APIs are what EDRs monitor. Function hooking via kernel callbacks, ETW monitoring, and file system filter drivers provide EDR vendors with several ways to detect malicious activity in a process, files, or a user session. But if you can connect to a driver directly, you can avoid these mechanisms and, in this case, read data from sensitive files.
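To make the read-raw-then-parse idea concrete, here is a simulated Python sketch: it fabricates a 512-byte NTFS boot sector in memory and parses it the way a raw-read tool would after pulling sector 0 off the physical device. The offsets follow the documented NTFS boot sector layout; the actual device open (CreateFile on `\\.\PhysicalDrive0`) is only described in comments because it needs admin rights on Windows.

```python
import struct

# Simulated NTFS boot sector. A real PoC would obtain these 512 bytes by
# opening \\.\PhysicalDrive0 and reading sector 0 directly, below the
# file-system APIs that EDR hooks watch.
sector = bytearray(512)
sector[3:11] = b"NTFS    "                 # OEM ID at offset 3
struct.pack_into("<H", sector, 11, 512)    # bytes per sector at offset 0x0B
sector[13] = 8                             # sectors per cluster at 0x0D
sector[510:512] = b"\x55\xaa"              # boot signature at 0x1FE

def parse_boot_sector(raw: bytes) -> dict:
    """Parse the few boot-sector fields needed to start walking the volume."""
    if raw[510:512] != b"\x55\xaa":
        raise ValueError("missing boot signature")
    bps = struct.unpack_from("<H", raw, 11)[0]
    return {
        "oem": raw[3:11].decode().strip(),
        "bytes_per_sector": bps,
        "cluster_size": bps * raw[13],
    }

info = parse_boot_sector(bytes(sector))  # {'oem': 'NTFS', ...}
```

From fields like these, a raw-read tool can locate the MFT and walk it to a target file such as the SAM, never issuing a monitored file open.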

Ellis, Steinberg, and Munsch did just that. If you have a BYOVD or a user with permission to access the driver, you can run their PoC, which performs a raw disk read, then parses and searches for something like the Windows SAM file: a fileless read without triggering an EDR. Here’s a mermaid diagram I created with Claude to visualize the attack flow:


Software Development Nuggets for Security Analysts Part 2: The Browser by David Burkett

This is a continuation of a blog post from David’s Part 1, released almost three years ago. I love the framing of trying to understand a security concept by first understanding why a software developer, or in this case, a critical piece of software like a web browser, would implement their code in a particular manner. In this case, Burkett studies why browsers are an attractive target for threat actors. But, to understand why they’re appealing, he breaks down the threat model of their architecture.

The threat model revolves around how a browser stores and protects sensitive data, such as cookies, session tokens, and other browser artifacts like extensions that store this type of data. These have to live somewhere, so he uses Velociraptor, a forensics tool, to analyze the directories and file names where a threat actor may attempt to read and exfiltrate that data.


☣️ Threat Landscape

When interesting threat landscape news emerges that warrants a deeper discussion, I typically post it under Threat Landscape. Still, I minimize the amount of analysis I provide due to space constraints. I’m going to try something a little different here and go deep on stories from time to time.

⚡ Emerging Spotlight: Salesloft Breach and the Victimology of Developer Tooling and Environments

Update on Mandiant Drift and Salesloft Application Investigations by Salesloft

I, like many other detection and threat intel folks, have availability bias. We analyze news stories and incident details based on the information available and, in detection-speak, on what falls within our expertise and the sources we typically build detections around. Without news stories or internal incident data, it’s challenging for us to determine whether something is worth defending.

I posted the Drift compromise details in my last issue, but what I’ve linked here is their investigative notes on the breach timeline. I said this on my LinkedIn post, but it reads like a modern red team report. The initial access is unknown, but the actor gained access to Drift’s GitHub organization, performed reconnaissance, and established persistence. They exfiltrated Drift’s codebase and then pivoted to Drift’s AWS environment, where they extracted OAuth secrets for customers with the Drift integration. From there, the threat actor began accessing Drift customers and extracting additional data, including Salesforce information, extra secrets, and customer data.

My availability bias here is that this feels like the first time a compromise chain has pivoted across GitHub, AWS, and an application in this way. I’m amazed and terrified because it showcases the new victimology that threat actors target more and more: developers and their tools.

Below is the breach diagram I made with Claude and posted on social media.


⚡ Quick Hits

NPM debug and chalk packages compromised (Hacker News post)

A single npm maintainer had their account compromised, resulting in the backdooring of 18 packages with over 2 billion combined weekly downloads. This is the Hacker News post (his reply is the first one), and it reaffirms my point in the spotlight above: threat actors are targeting developers and their tools.


Contagious Interview | North Korean Threat Actors Reveal Plans and Ops by Abusing Cyber Intel Platforms by Aleksandar Milenkoski, Sreekar Madabushi (Validin) and Kenneth Kinion (Validin)

SentinelOne and researchers from Validin identified a cluster of DPRK actors leveraging Validin to hunt for their own command and control (C2) infrastructure in Internet-wide scan data. The amusing part here is that as they set up monitoring and queried these platforms for their infrastructure, they in turn handed Validin and SentinelOne the very queries needed to locate it.


s1ngularity's Aftermath: AI, TTPs, and Impact in the Nx Supply Chain Attack by Rami McCarthy

Something is in the air regarding open-source software supply chain attacks. McCarthy does an excellent job summarizing the s1ngularity security incident, which involved the theft of an npm publishing token via a malicious pull request on GitHub. Users who downloaded the latest version had their secrets stolen; many had their private repositories uploaded to GitHub as public; and the attackers employed a peculiar technique of using AI to identify interesting files.


🔗 Open Source

andrewkolagit/DetectPack-Forge

Neat full-stack application that uses Gemini and n8n to turn natural language into detection rules. You’ll see many companies offering this type of service, but it’s impressive to see something implemented and shared on GitHub.


almounah/orsted

Yet another educational-use post-exploitation and C2 framework. Their docs site looks promising in terms of how it works, so it’d be a good addition to detection backlogs to check for telemetry.


DeepSpaceHarbor/Awesome-AI-Security

Another awesome-* list, this time for AI security resources. This list is much more academically focused, but you’ll find a few blogs and code examples in there too.


tclahr/uac

Unix-like Artifact Collector (uac) is a forensic tool designed to collect breach artifacts across various flavors of Unix and Linux. It’s modular, allowing you to add artifact collections via configurable YAML files. It requires no installation: the core functionality is one big ol’ shell file, which means it can run on everything from NAS boxes to IoT devices.

DEW #127: SOC Visibility Triad, Feedback loops in detection, PowerShell detection ideation

3 September 2025 at 13:39

Welcome to Issue #127 of Detection Engineering Weekly!

Every week, I read, watch and listen to all the Detection Engineering content so you can consume it all in 10 minutes. Subscribe and get a weekly digest of the latest and greatest in threat detection engineering!

✍️ Musings from the life of Zack in the last week

  • I’ve bought a subscription to Claude and I’ve really enjoyed using it over ChatGPT. I feel like it’s more concise in my asks, and it does a great job thinking like an engineer

  • Thank you all who reached out asking about sponsoring the newsletter. More on this soon but it looks like this is gonna be something I can do which is exciting :)

  • Growing this newsletter means more and more to me as I write. I’ve really REALLY enjoyed writing lately. I owe y’all several new issues of the Detection Field Manual, so stay tuned!

⏪ Check out last week’s issue if you’ve missed it!


💎 Detection Engineering Gem 💎

SOC Visibility Triad is Now A Quad — SOC Visibility Quad 2025 by Anton Chuvakin

When was the last time you read a blog series that started 10 years ago? Anton’s original post explained the “SOC visibility triad,” which contains core telemetry pillars that every security operations team needs to have visibility into. Remember, there is no rule without telemetry, so it’s essential to understand the highest-value telemetry in your network to write rules against. According to the original post linked above, the triad includes logs, network, and endpoints:

(src: Gartner)

Arguably, every piece of technology in modern stacks can fit into one of these three buckets. And this makes sense: network and host deserve their own buckets, and SIEM can be a catch-all source as you lob whatever log source you can at it. However, many things changed between 2015 and 2020.

Anton revisited the triad in 2020, and although hyperscalers like AWS or GCP were skyrocketing in popularity, he did not add another leg to the triad. I think the spaces were nascent to threat actor activity at the time. In 2025, I don’t believe this is the case anymore!

Over the last five years, SaaS software has become an integral part of modern business, backed by hyperscalers. As a result, applications have become so complex that they require their own dedicated threat models. For example, understanding that Cloud platforms like AWS are operating systems themselves means they have sufficient complexity for threat actors to exploit, and the concept of Application Detection & Response (ADR) is entering the security lexicon.

So, the triad becomes a diamond, and application telemetry should now be part of your security operations umbrella. Now I hope Anton invites me on his podcast to debate this next :).


🔬 State of the Art

Malicious Encoded PowerShell: Detecting, Decoding & Modeling by Alex Teixeira

I am amazed by the complexity behind threat detection strategies, where detecting something seemingly simple becomes extremely difficult. This post is an excellent demonstration of that theme, where Teixeira discusses a detection opportunity for malicious PowerShell using the EncodedCommand argument. It’s one of those nuanced problems where a 1-hour interview discussion can focus solely on detecting malicious PowerShell usage.

He starts with the basics of the technique: the EncodedCommand parameter, which passes a base64-encoded script to the PowerShell executable. Encoding and decoding base64 around an executable should raise some eyebrows for security engineers, but it’s more common than we all think. Still, attackers use it for defense evasion, and it can be difficult to differentiate benign from malicious usage.

The interesting aspect of this detection post is identifying all the variations of EncodedCommand flags. Teixeira calls out the parameter-abbreviation “feature” of PowerShell, where you can pass all kinds of permutations, such as -en, -e, -encode, and so forth. He builds some gnarly regular expressions to capture this, but keeps building out the complex edge cases until it feels, at least to me, untenable. The critical lesson under the 'Which logs to use?' section is that detection models should focus on detecting obfuscation, not encoding, as they can sometimes be conflated.
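A minimal Python sketch of that detection idea, assuming the permutations collapse to the prefix chain -e, -en, -enc, … -encodedcommand. This deliberately ignores edge cases Teixeira covers (the -ec alias, /-style switches, payloads that aren’t the last token), so treat it as a starting point, not his rule.

```python
import base64
import re

# Case-insensitive match for any prefix of -EncodedCommand, built from
# nested optional groups: -e, -en, -enc, ... -encodedcommand.
ENC_FLAG = re.compile(
    r"(?i)\s-e(?:n(?:c(?:o(?:d(?:e(?:d(?:c(?:o(?:m(?:m(?:a(?:n(?:d)?)?)?)?)?)?)?)?)?)?)?)?)?\s"
)

def decode_if_encoded(cmdline: str):
    """Return the decoded script if the command line uses an
    EncodedCommand variant, else None."""
    if not ENC_FLAG.search(cmdline):
        return None
    b64 = cmdline.rsplit(" ", 1)[-1]
    # PowerShell base64-encodes the script as UTF-16LE
    return base64.b64decode(b64).decode("utf-16-le")

payload = base64.b64encode("Write-Host hi".encode("utf-16-le")).decode()
decode_if_encoded(f"powershell.exe -nop -w hidden -enc {payload}")
# -> "Write-Host hi"
```

Decoding the payload at detection time is exactly what enables the obfuscation-vs-encoding distinction the post lands on: you score the decoded script, not the presence of base64.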


Echo Chambers — Feedback Loops in Detection Engineering by Nasreddine Bencherchali

Feedback is a gift. Acting on feedback is a way to utilize that gift in the best possible ways. But this post isn’t necessarily about getting feedback from your boss or loved ones; rather, it’s about the diversity of feedback you receive on your detection rules. I love this concept because it frames detection ideation and maintenance under the threat of bias when you receive feedback.

In this post, Nas describes “The Echo Chamber Effect” of detection feedback. There’s nothing better than the dopamine hit of deploying a great detection rule, but will that dopamine and positive reinforcement keep that rule excellent in the long run? How about when you receive negative feedback, and you tune the rule so much that it’s brittle? What about chasing 100% MITRE coverage?

All of these feedback loops are pitfalls. Humans tend to focus on the immediate pain or pleasure of a response, which can lead us down a rabbit hole of confirmation bias. So, to combat the Echo chamber, Nas introduces several strategies to break out of it. These strategies are very similar to threat intelligence analysis, where analysts employ various techniques to challenge their biases and ensure their conclusions are sound.


GraphApiAuditEvents: The new Graph API Logs by Bert-Jan Pals

Microsoft recently released a GraphApiAuditEvents table for DefenderXDR, so if you are a customer of the product, you get this table for free. And by free, I mean it doesn’t charge for ingestion or storage costs, which makes it a nice add-on for those looking to set up out-of-the-box alerting for Azure Entra. Pals compares this new table to the existing MicrosoftGraphActivityLogs for Sentinel. You receive the same event types and similar counts of log ingestion for the same events. However, GraphApiAuditEvents has fewer fields (19 compared to 33) and appears to be missing two critical fields related to Device and Session ID. Both of these are crucial for detecting credential stuffing and account takeover attacks, so you may need to incorporate other post-compromise mechanisms into your KQL rules to compensate.


Refinery raid by Nick Foulon

Have you ever wanted to set up an oil plant? I think I’d like that experience at least once in my life. I imagine the startup costs are pretty high, and you’d need lots of permits, permissions and there’s the whole thing about killing the planet you’d have to navigate, but I imagine the operational complexity is fascinating.

Jokes aside, doing security for an oil plant is the most interesting part for me. Luckily, Operational Technology (OT) security is becoming more mainstream for cybersecurity people, so it’s cool to see a blog post on how to set up and attack a virtual oil plant. Foulon walks readers through setting up the Labshock Oil Plant environment, and it’s basically an oil refinery in a docker container.

It’s pretty terrifying that you can emulate connecting to a PLC and start turning pumps on and off. I hope operational environments aren’t this easy to attack!

Foulon ends the blog by highlighting some of the critical vulnerabilities you explore during the lab, and sets readers up nicely for Part 2 with some defensive strategies.


☣️ Threat Landscape

Widespread Data Theft Targets Salesforce Instances via Salesloft Drift by Austin Larsen, Matt Lin, Tyler McLellan and Omar ElAhdan

The big threat landscape news from this past week involved a Salesforce integration and app company, Salesloft, suffering a data breach from UNC6395. According to Google’s Mandiant, UNC6395 compromised Salesloft and pivoted to their integrations with Salesforce. They exfiltrated shared secrets, such as OAuth app and secret keys, as well as other stored data, including cloud and API keys. They provide helpful hunting tips and indicators of compromise to help others investigate.


Amazon disrupts watering hole campaign by Russia’s APT29 by AWS

AWS Threat Intelligence released a blog detailing their disruption of Midnight Blizzard / APT29. This group is also known as Cozy Bear for those who want to harmonize. The team identified domain names used by APT29 that leveraged watering-hole techniques to redirect victims to attacker-controlled infrastructure, which was then used for performing phishing attacks utilizing Microsoft’s device code authentication flow.


Detecting and countering misuse of AI: August 2025 by Anthropic

This is a pretty crazy threat intelligence update from the security team at Anthropic. They uncovered a campaign by a threat actor who successfully ransomed and extorted 17 victims and used Claude to help them move through the attack chain. This is the excerpt from the Anthropic team:

The actor used AI to what we believe is an unprecedented degree. Claude Code was used to automate reconnaissance, harvesting victims’ credentials, and penetrating networks. Claude was allowed to make both tactical and strategic decisions, such as deciding which data to exfiltrate, and how to craft psychologically targeted extortion demands. Claude analyzed the exfiltrated financial data to determine appropriate ransom amounts, and generated visually alarming ransom notes that were displayed on victim machines.


Storm-0501’s evolving techniques lead to cloud-based ransomware by Microsoft Threat Intelligence (MSTIC)

Ransomware actors are becoming increasingly proficient in leveraging their access to pivot into cloud environments. In this post, MSTIC details an intrusion they helped respond to for Storm-0501. The group got access to a victim environment, elevated privileges, and then pivoted to several of their Azure tenants. They used AzureHound for reconnaissance and attack path mapping until they found a non-human identity with Global Admin access. They exfiltrated data from Azure storage accounts and deleted as much as possible from those same storage blobs.


🔗 Open Source

MorDavid/BruteForceAI

LLM-powered credential stuffing tool. You point it at a login page, it determines the login endpoint, and it brute-forces credentials to find valid accounts. You can use a locally hosted model and a Chromium browser, and it’s smart enough to rotate several features to avoid bot-protection mechanisms.


Agentity-com/mcp-audit-extension

This is a great project that proxies MCP client/server communication using a VSCode extension. Lots of MCP interactions happen specifically within IDEs, so logging there makes total sense. You can look at the calls to MCP servers and send them to a SIEM or logging infrastructure. Here are the fields it logs in JSON format.


mandiant/flare-floss/releases/tag/quantumstrand-beta1

Mandiant released a tool for malware analysts and reverse engineers that helps find, tag and present interesting strings in a binary sample. The README says they want to provide “deep context” around strings in a tree-like manner so you can circumnavigate the binary without losing track of why certain string values exist in a region of the code.


dockur/windows

Run a full Windows OS directly in a docker container! This is a fun one because it lists out Windows OSes all the way back to Server 2003, which is a whopping 600 MB. This might be a fun one to run with threat emulation pipelines and for detection ideation.


pwnfuzz/diffrays

IDA Pro extension to optimize binary diffing. This is especially useful if you want to reverse engineer patches in operating systems, see where vulnerabilities were fixed, and build payloads to exploit older versions. These are especially fun when you don’t have access to anything but the OS images.


Det. Eng. Weekly #126 - live laugh logs

27 August 2025 at 14:03

Welcome to Issue #126 of Detection Engineering Weekly!

📣📰🌐 Now accepting sponsors!

The newsletter is growing, and I’m excited to announce that I am accepting sponsors to place ads in issues of Detection Engineering Weekly.

You’ll notice some Issues containing a sponsored ad section at the top of the newsletter. Each Weekly is always free, so nothing changes there.

I have some constraints on who I work with, so if you are interested in collaborating with me, please send me a message via email by clicking the button below.

Sponsor Detection Engineering Weekly

⏪ Did you miss the previous issues? I'm sure you wouldn't, but JUST in case, check out last week's issue!

💎 Detection Engineering Gem 💎

The Tyranny of False Positives by Michael Taggart

I’ve battled my whole career with false positive alerts. The visceral allergic reaction that these cause to analysts, detection teams and customers can make you feel like the worst person in the world. It’s like taking a production database down (which I’ve also done), but instead of booting it back up, you are reminded that your customer will never get that time back, so it feels way more existential.

Detection efficacy, which I discussed extensively in Field Manual #3, is my attempt to describe the usefulness of an alert based on its accuracy, triage capacity, and risk tolerance. But, after reading Taggart’s blog on efficacy, I think I’m a believer in their proposed Admiralty-code-like approach to efficacy. And, Taggart is talking about game theory, so I guess I owe him a beer if we ever meet at a conference.

What you see above is Taggart’s approach to the usefulness or efficacy problem in Detection Engineering. It’s somewhat of a homage to One-shot games, with the idea that you perform one action-to-investigation traversal per alert. However, note that the outcomes aren’t TP/FP/TN/FN labels; instead, they are actions taken after an investigation. This framing makes it much easier for detection teams and leadership to understand the value of alerts, rather than relying on ratios and MITRE coverage.

At the end of the post, Taggart proposes an Admiralty code-like labeling system with a sprinkle of CVSS to help capture these path traversals. It’s interesting in the sense that it fills a measurement gap in a much higher dimensionality than TP/FP. Still, it can definitely be harder to parse for the normies outside of detection circles. Even so, taking this labeling system and encoding it into higher-level metrics can likely provide detection and incident teams with trends on their alerting strategies.


🔬 State of the Art

Architecting Your Detection Strategy for Speed and Context by Jack Naglieri

The most significant jump in understanding I have found with detection engineers is when they start differentiating alerting strategies based on the behaviors they aim to catch. For example, conceptually understanding that a connection to a known malicious domain should result in an immediate alert and a subsequent blocking action happens quickly in an operational environment. But, what if the domain is not marked as malicious, yet the traffic looks suspicious? What does suspicious actually mean in this context?

Jack’s blog explores these concepts in detail. Detection strategies emerge when behavioral or contextual information becomes a critical decision point for finding threat actor activity. Atomic detections, according to Jack, are akin to the malicious-domain idea. Likewise, other behaviors, such as malicious cronjobs, cURLing an unknown binary in a user session, or adding malicious email-forwarding rules, all make sense to alert on in isolation.

More sophisticated behaviors, or malicious behavior that looks benign, are where this gets complicated. Risk-based alerting strategies can help, but adding contextual signals (who is this person? Should someone in Finance be using the command prompt?) and historical signals (does this engineer usually log in to all production databases off hours?) creates a more robust approach to threat detection.
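A toy scoring pass over those atomic, contextual, and historical signals might look like the following. The field names, weights, and threshold are all invented for illustration and are not from Jack’s post.

```python
ALERT_THRESHOLD = 60

def risk_score(event: dict) -> int:
    score = 0
    if event["domain_reputation"] == "malicious":
        score += 100   # atomic: alert on its own
    if event["dept"] == "finance" and event["used_cmd_prompt"]:
        score += 40    # contextual: unusual for the role
    if event["off_hours"] and not event["historically_off_hours"]:
        score += 30    # historical: deviation from this user's baseline
    return score

event = {
    "domain_reputation": "unknown",   # not atomically bad...
    "dept": "finance",
    "used_cmd_prompt": True,          # ...but odd for the role
    "off_hours": True,
    "historically_off_hours": False,  # and off this user's baseline
}
risk_score(event) >= ALERT_THRESHOLD  # True: context plus history add up
```

No single signal fires alone here; the alert only trips because the contextual and historical deviations stack.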

Jack goes into deep detail on these strategies and provides numerous excellent examples, both scenarios and rules, for others to follow. At the end of the blog, he makes some excellent points on how LLM and agentic approaches to investigation alleviate this problem even more, and tops it off with an example playbook for readers to take home and try in their own environments.


What Framing Security Alerts as a Binary True or False Positive is Costing You by David Burkett

This False Positive blog by Burkett offers an excellent perspective that should be combined with the reading from Taggart. I like how he frames false positives from a multi-environment point of view, rather than focusing on a single organization. The example he uses revolves around an MSSP that deploys a rule to monitor remote monitoring and management (RMM) tools:

Customer A uses RMM Tool AnyDesk for normal IT operations.
Customer B uses RMM Tool ConnectWise for normal IT operations.
Your rule triggers whenever AnyDesk is detected.
For Customer A, the alert is technically correct, your detection fired on exactly what it was built to detect. Under SDT, that’s a true positive. But operationally, it has no security impact because it’s part of daily business activity.

For Customer B, the same detection could be suspicious or outright malicious. Same rule. Same accuracy. Very different impact.

This scenario demonstrates a key concept he explains in the next section: intent. If a rule accurately catches AnyDesk but Customer A sanctions AnyDesk, the behavior was captured, yet the alert carries no malicious intent for Customer A. If the intent of the rule is to catch malicious use of RMM tools, then Customer B benefits from it, because they don’t use AnyDesk at all. Intent, according to Burkett, captures the full lifecycle of an alert from generation to investigation. This inclusive approach helps teams tune rules to their environmental context rather than the technicalities of catching behavior.
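The intent-aware tuning he describes can be sketched as a per-customer allowlist. Customer names and sanctioned tools mirror the hypothetical MSSP example above.

```python
# Which RMM tools each customer sanctions for normal IT operations.
SANCTIONED_RMM = {
    "customer-a": {"anydesk"},
    "customer-b": {"connectwise"},
}

def triage_rmm(customer: str, tool: str) -> str:
    """Same telemetry, different verdict depending on environment."""
    if tool in SANCTIONED_RMM.get(customer, set()):
        return "expected-business-use"   # technically a TP, no security impact
    return "investigate"                 # unsanctioned RMM: possible intrusion

triage_rmm("customer-a", "anydesk")   # "expected-business-use"
triage_rmm("customer-b", "anydesk")   # "investigate"
```

Same rule, same accuracy, two different outcomes, which is exactly the gap a binary TP/FP label can’t express.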


The Fragile Balance: Assumptions, Tuning, and Telemetry Limits In Detection Engineering by Nasreddine Bencherchali

Detection writing is a practice of informed assumptions. I love this framing by Nasreddine because it captures the detection engineering condition that lots of us live with every day. In this blog, tuning is front and center as his approach to solving false positives. But tuning is more than adding suppression lists or changing a field in a query. For example, the typical behavior of a user or an environment can influence how you tune a rule. Another example is a lack of telemetry: do you have all the necessary context before you can make an alerting decision? He lists several fields from Sysmon Event ID 11, but those fields may not give a detection engineer enough information to make an informed decision.

He ends the blog with my favorite section, “Choosing the Right Fix.” Basically, it boils down to the capacity of the security team and the detection engineer. Was the assumption on the behavior too uninformed or not tailored to the environment? Is the risk tolerance for this specific event so low that it needs to be aggressively filtered? This is a fantastic tactical look at false positives because it helps give us a methodology behind the problem.
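Two of those “right fixes” can be sketched against a Sysmon Event ID 11 (FileCreate) record: suppress a known-good writer, or narrow the rule’s assumption. Image and TargetFilename are real EID 11 fields; the allowlist, extensions, and paths are invented.

```python
KNOWN_GOOD_IMAGES = {r"c:\program files\corp\updater.exe"}  # env-specific
SUSPICIOUS_EXTS = (".ps1", ".hta", ".lnk")

def should_alert(evt: dict) -> bool:
    # Fix 1: suppression list for writers this environment trusts.
    if evt["Image"].lower() in KNOWN_GOOD_IMAGES:
        return False
    # Fix 2: narrow the assumption from "any file create" to
    # "scripty files landing in a temp directory".
    target = evt["TargetFilename"].lower()
    return target.endswith(SUSPICIOUS_EXTS) and "\\temp\\" in target

should_alert({"Image": r"C:\Windows\explorer.exe",
              "TargetFilename": r"C:\Users\a\AppData\Local\Temp\x.ps1"})  # True
should_alert({"Image": r"C:\Program Files\Corp\updater.exe",
              "TargetFilename": r"C:\Users\a\AppData\Local\Temp\x.ps1"})  # False
```

Which fix you reach for is exactly the capacity and risk-tolerance question the post closes on.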


Using Auth0 Event Logs for Proactive Threat Detection by Maria Vasilevskaya

The Auth0 team at Okta just launched a new detection rule set to help protect Auth0 customers. I love the framing of their release because it focuses on several personas who could use these detections. I’ll link the full repository below, but if you take a look at the rule Refresh Token Reuse Detection, the team put in excellent metadata. The explanation, comments and prevention fields have plenty of commentary for folks to internalize these detections and understand the intent behind each one.


Detecting ClickFixing with detections.ai — Community Sourced Detections by mikecybersec

It’s cool to see community-sourced detections coming together on a platform other than GitHub! In this post, mikecybersec explores how the community website detections.ai can be leveraged when reading a threat intelligence report on ClickFix. Detection ideation and research is one of the most fun parts of detection engineering, and in this specific blog, mikecybersec references MSTIC’s latest ClickFix blog as their source of inspiration for finding rules. They searched the community portal for ClickFix, picked out some candidate rules, added their own tuning, and began testing.


☣️ Threat Landscape

iOS 18.6.1 0-click RCE POC by b1n4r1b01

The big vulnerability research news this last week is Apple’s out-of-band security patch for CVE-2025-43300, a 0-click RCE vulnerability. An attacker can build a specially crafted picture leveraging Apple’s image decompression library and achieve remote code execution on a victim device. It was hard finding technical details on this one, but several news articles linked to b1n4r1b01’s implementation here. I couldn’t verify the PoC here.


Falcon Platform Prevents COOKIE SPIDER’s SHAMOS Delivery on macOS by Maddie Stewart, Suweera De Souza, Ash Leslie, and Doug Brown

Who would win in a fight, a trillion-dollar company named after a fruit, or one piece of malware leveraging a password prompt to bypass GateKeeper? Researchers at CrowdStrike uncovered an extensive ClickFix campaign targeting macOS victims to deploy a version of AMOS Stealer. The campaign leverages a combination of malvertising and social engineering techniques to get victims to run a malicious bash command, which subsequently installs the malware. This leads to infections that steal secrets and provide initial access, similar to other infostealer families on Windows.


ghrc.io Appears to be Malicious by Brandon Mitchell

I am both impressed and terrified by how creative attacks emerge from the software supply chain ecosystem. This ecosystem accelerates to keep up with the demand of developers shipping software, and with that acceleration, leaves open all kinds of interesting attack paths inside a network. Mitchell found an attack path leveraging an old school attack that appears to still work to this day: typosquatting. This typosquat domain abuses a keyboard slip that can result in a compromise via a malicious container from GitHub’s container registry.
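A rough sketch of how you might hunt for this class of typosquat in your own image references, using a simple one-typo edit check (the registry list and helper names are mine, not from the post):

```python
# Hedged sketch: flag container registry hostnames that sit one typo
# away from a well-known registry (e.g. ghrc.io vs ghcr.io).
KNOWN_REGISTRIES = {"ghcr.io", "docker.io", "quay.io", "gcr.io"}

def one_typo_away(a: str, b: str) -> bool:
    """True if a and b differ by one substitution or one adjacent swap."""
    if len(a) != len(b):
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diffs) == 1:
        return True  # single substituted character
    if len(diffs) == 2:
        i, j = diffs
        return j == i + 1 and a[i] == b[j] and a[j] == b[i]  # keyboard slip swap
    return False

def suspicious_registry(host: str) -> bool:
    host = host.lower()
    return host not in KNOWN_REGISTRIES and any(
        one_typo_away(host, known) for known in KNOWN_REGISTRIES)
```

Run something like this over the image references in your manifests and CI configs; `ghrc.io` trips the adjacent-swap check against `ghcr.io`.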


How Spur Uncovered a Chinese Proxy and VPN Service Used in an APT Campaign by Spur Engineering

I linked a story two issues ago about a leak of DPRK operations from a security researcher. The Spur team took that leak and combed through it for Kimsuky infrastructure to see if they could identify and fingerprint any type of proxying or VPN service leveraged in their campaigns. This is a great example of a threat intelligence pivoting blog that starts with a single IOC and uncovers a large swath of infrastructure used to carry out attacks.


Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet by Artem Chaikin and Shivan Kaul Sahib

Vulnerability writeups against agentic products remind me a lot of the early days of security, when buffer overflows were the norm. Those are now taught in “basic” security courses, and they’re almost laughed at because protecting a privileged context from untrusted input seems like a given. That same class of vulnerabilities from two decades ago is now rearing its ugly head in the implementation of LLMs in everyday tools.

Brave researchers found this exact problem in Perplexity’s Comet browser. They managed to achieve prompt injection of Comet’s LLM handler, where a specially crafted webpage (read: just a freakin’ prompt) can influence how Comet behaves. Their PoC involves doing some scary things with a personal email address and Comet’s login page.


🔗 Open Source

DataDog/ghbuster

GitHub enumeration and investigation tool built by my colleague Christophe Tafani-Dereeper. You provide a GitHub username, and it’ll use several OSINT heuristics across the account’s repositories and GitHub as a whole to uncover intelligence around the account.


ammarion/Take_Home_Exercise

This is a cool interview take home exercise from the Adobe security team. I’ve gone back and forth on take home exercises, because they can sometimes be used to do free work for the company, but this one looks like an interesting and timeboxed challenge.


auth0-customer-detections

Detection rules from the Auth0 announcement above. It has excellent documentation, both in the rules and in the README on the main page. You can run Sigma tooling over all of these rules to convert them to several formats.


shell-dot/tuoni

Yet another open source post-exploitation toolset. Of course, it says it’s for educational use only, though you can never trust that the bad guys are using open source tools just to learn.


Det. Eng. Weekly #125 - I'm the Miss Rachel of Threat Detection

20 August 2025 at 01:58

Welcome to Issue #125 of Detection Engineering Weekly!


💎 Detection Engineering Gem 💎

Build a Time-Based Threat Detector with Lambda and Athena by Rich Mogull

My friend Rich wrote an excellent post on building sliding query window detections using AWS services. I saw it on social media, added it to my queue, and lo and behold, he also gave this newsletter a shoutout! I need to return the favor, so please go check out his newsletter/blog, Cloud Security Lab a Week.

The cool part about Rich’s blogs is that he breaks posts down into two sections: theory and labs. The theory in this post revolves around sliding window detections and how you can get around the Halting Problem using overlapping windows of log inspection. Essentially, you need to do a batch based lookup of logs to find indicators of maliciousness, but you run the risk of missing a window of logs if you do sequential windows. Here’s a visual from his blog:

This reduces the risk of missing critical logs that comes with sequential windowing. I took Rich’s windowing visual and tried to show it using Mermaid (so it can fit inside the newsletter!)

Notice here that the windowing period is 20 minutes. Each CloudTrail event flows into the pipeline, and A, B and C all arrive in 12 minutes or less. Event D, however, arrives at 23 minutes, which means it’s outside the window and the alert is scrapped. The alt box shows what would happen if it had arrived in 20 minutes or less (alert).
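The windowing schedule itself is tiny. Here’s a hedged Python sketch of the overlapping-window idea (parameter names are mine, and Rich’s lab runs Athena queries rather than a generator):

```python
# Minimal sketch of overlapping query windows: each batch query covers
# `window` minutes of logs but only advances by `step` minutes, so an
# event that lands near a boundary (like event D at minute 23) is still
# covered by the next query.
def overlapping_windows(start: int, end: int, window: int = 20, step: int = 10):
    """Yield (window_start, window_end) pairs, in minutes, until `end` is covered."""
    t = start
    while t < end:
        yield (t, t + window)
        t += step
```

With a 20-minute window and a 10-minute step, minute 23 falls outside the 0–20 query but inside the 10–30 one, which is exactly the gap sequential windowing would miss.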

The Lab section uses a CloudFormation stack to boot up all of the necessary infrastructure, and you can see, feel and click on the infrastructure, logs and alerts to see how it all works under the hood. He also has a video walkthrough if folks prefer a more visual format.


🔬 State of the Art

From Telemetry to Signals: Designing Detections with an Audience in Mind by Nasreddine Bencherchali

Nas is SO back.

And by back, I mean cranking out highly relevant posts discussing super insightful topics, so make sure to go check out and subscribe to his Medium. In this post, he explains the nuances behind the customer archetype of a detection rule. Security Operations departments and teams all look different. Some have a massive scale with humans, technology and several departments looking at various parts of an attack chain, while others have 4-5 Detection & Response engineers responding to everything. Nas calls this “Schrödinger’s Detection”.

With this in mind, does it make sense to write one rule that can fit the needs for the following “archetypes”?

  • Incident Responder

  • SOC Analyst

  • D&R Engineer

  • Software Teams owning the Detection and SIEM Infrastructure

Nas outlines the needs of each of these archetypes and the incentive structure behind their security work. For example, if you ship a rule that casts a wide net and looks for an IP address, a small team of 5 D&R Engineers or a SOC will yell at you. But this could be super useful for an Incident Responder. I’ve hit this concept hard with my detection teams to make sure that we think about the consumer of our detections first, and then we can think about the usefulness of what we ship into production.

This is an apt quote from the blog on setting expectations:

This is why setting expectations is technically not optional, it’s the difference between a detection being understood and valued versus misunderstood and scrapped.


DFIR Artifact: PowerShell Transcripts by Eric Capuano

PowerShell Transcripts are like a “flight recorder” for PowerShell, as Eric puts it. Since PowerShell is so heavily used in both threat actor and system administrator operations, logging execution details and associated metadata is essential for DFIR engagements.

Eric walks readers through the configuration and deployment options for transcripts and shows several examples of what the transcripts look like and how to parse them. Lastly, he adds my favorite part of any detection blog: detection opportunities!
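As a toy example of working with transcripts, here’s a hedged Python sketch that pulls fields out of a transcript header banner. The exact header fields vary by PowerShell version, so treat the sample below as an assumption rather than a canonical format:

```python
# Hedged parser sketch for PowerShell transcript headers. Transcripts
# open with a banner block of "Key: value" lines; field names can vary
# by PowerShell version.
def parse_transcript_header(text: str) -> dict:
    fields = {}
    for line in text.splitlines():
        # Skip the asterisk banner lines; split remaining lines on the
        # first colon into key/value pairs.
        if ":" in line and not line.startswith("*"):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

sample = """**********************
Windows PowerShell transcript start
Start time: 20250815120000
Username: CORP\\alice
Machine: WS01 (Microsoft Windows NT 10.0)
**********************"""
```

From there, you can bulk-index headers across a fleet and pivot on usernames or start times during an engagement.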


The Hidden Gaps in Microsoft’s New Linkable Token Identifier by Mehmet Ergene

Mehmet’s latest update on new Microsoft Threat Detection products sheds some light on their latest feature, linkable token identifiers. The basic idea behind the identifiers is that Entra Security Logging has more metadata around particular actions within the SignIn workflow, and it can help link together events using the same session ID. This is useful for detecting and shutting down Adversary-in-the-Middle (AitM) phishing attacks.

According to Mehmet, this helps with AitM attacks but leaves out other, more recent phishing-style attacks against Entra/Azure. He goes explicitly into two phishing TTPs: device-code phishing and token stealing. I had ChatGPT build two Mermaid diagrams to visualize both, but make sure to go read the “why” and detection opportunities from his blog.

Token Stealing and Device Code (diagrams in the original post)
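To make the session-linking idea concrete, here’s an illustrative Python sketch. The event field names are my assumptions, not Microsoft’s actual log schema:

```python
from collections import defaultdict

# Illustrative sketch only: group sign-in events by session identifier
# and flag sessions whose token shows up from more than one IP address,
# a rough proxy for token theft or replay.
def sessions_with_multiple_ips(events):
    """events: iterable of dicts with 'session_id' and 'ip' keys (assumed)."""
    ips = defaultdict(set)
    for event in events:
        ips[event["session_id"]].add(event["ip"])
    return {sid for sid, addrs in ips.items() if len(addrs) > 1}
```

In practice you’d enrich this with ASN or geolocation data before alerting, since legitimate roaming users will trip a raw multiple-IP check.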

FortMajeure: Authentication Bypass in FortiWeb (CVE-2025-52970) by BigShaq

I love vulnerability write-ups where an exploit attacks a cryptographic implementation to nullify a cryptographic primitive. BigShaq was poking holes in FortiWeb and found an auth bypass where attackers supply a number between 2 and 9 (lol) in an attacker-controlled parameter and force a cryptographic key to be all 0s. Since the key is all 0s, they were able to sign an HMAC message with that same all-zeros key and get a FortiWeb webshell.
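You can see why the all-zeros key is fatal in a few lines of Python. This is the general HMAC failure mode, not FortiWeb’s actual code:

```python
import hashlib
import hmac

# Sketch of the failure mode: once an attacker can force the server's
# HMAC key to all zeros, they can sign arbitrary payloads themselves
# and the server-side comparison will pass.
def sign(payload: bytes, key: bytes) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

zero_key = b"\x00" * 32
forged = sign(b'{"user": "admin"}', zero_key)    # attacker-side signature
expected = sign(b'{"user": "admin"}', zero_key)  # server with the nulled key
```

The HMAC math is doing its job perfectly; the bypass works because the key material stopped being secret.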


How XProtect’s detection rules have changed 2019-25 by Howard Oakley

XProtect is Apple’s defense-in-depth technology that remediates and blocks known malware after it executes on a macOS device. It comprises YARA rules that act as generic signatures for common malware patterns, so when it detects something on the device, it’ll block execution and move it to the Trash. Apple stores these rules in a single text file, and Oakley has been tracking Apple security engineers’ maintenance of this file for several years.

The number of YARA rules has increased to nearly 400 in the latest release, and according to Oakley, it appears they haven’t removed any rules affecting older versions of macOS. This is especially interesting when you consider Apple’s switch to its Silicon processors, which means not every device can run malware developed years ago. This ballooning ruleset has affected how long it takes Apple’s GateKeeper to run, so hopefully they trim it down, or malware will probably just beat GateKeeper to its objective before the YARA rules finish running.
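If you want to eyeball the rule count on your own Mac, here’s a hedged Python sketch. The bundle path is where recent macOS versions keep the file, but Apple may move it, so treat both the path and the regex as assumptions:

```python
import re
from pathlib import Path

# Assumed location of the XProtect YARA ruleset on recent macOS versions.
XPROTECT = Path("/Library/Apple/System/Library/CoreServices/"
                "XProtect.bundle/Contents/Resources/XProtect.yara")

def count_yara_rules(text: str) -> int:
    # Count "rule <name>" definitions at line start; "private rule" counts too.
    return len(re.findall(r"^(?:private\s+)?rule\s+\w+", text, re.MULTILINE))

if XPROTECT.exists():
    print(count_yara_rules(XPROTECT.read_text(errors="ignore")))
```

Diffing that count (or the file itself) across XProtect updates is essentially a lightweight version of what Oakley has been tracking.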


☣️ Threat Landscape

Datadog threat roundup: Top insights for Q2 2025 by Greg Foss, Andy Giron and Matt Muir

~ Note, I work at Datadog, and Greg/Andy/Matt are all my colleagues! ~

The Security Research and Intelligence team at Datadog just published our Q2 threat roundup. We found lots of interesting attacks this quarter targeting the open source supply chain, containerized workloads and cloud environments. The highlight of this quarter was uncovering new TTPs from Mimo, who started building out their rolodex of CMS vulnerabilities and targets to deploy their malware.


Remote unauthenticated command injection by Fortinet PSIRT

This is a different CVE than the FortMajeure CVE listed above in State of the Art. CVE-2025-25256 affects Fortinet’s FortiSIEM product, which makes this even funnier because it attacks a security product that is supposed to detect attacks just like this one. Fortinet PSIRT says there are already practical exploits in the wild for this, so put your SIEM behind a gateway and don’t expose it to the Internet!


LazarOps: APT Tactics Targeting the Developers Supply Chain [PART 1] by SecurityJoes

I’ve posted several articles focusing on DPRK-Nexus threat actors compromising developers and the software supply chain. I’ve never seen an article on this subject from an Incident Response team that helped remediate an incident for a victim. This looks like a typical Contagious Interview attack chain, where the developer was contacted by a fake recruiter and tasked to solve a coding test in a malicious GitHub repository. The updated TTPs I’ve seen here include iterating through a list of 1000 potential domains in a pastebin file if the initial C2 communication fails.


How attackers are using Active Directory Federation Services to phish with legit office.com links by Luke Jennings

This is an excellent follow-up post on phishing techniques to Mehmet’s post above, so make sure to read that one first. During his phishing kit research, Jennings found a peculiar malvertising-style phishing page for Microsoft accounts.

Attackers found a clever open-redirect in office.com, not due to a vulnerability or open parameter, but rather due to how Office performs redirects to ADFS tenants for authentication. By creating an attacker-controlled ADFS tenant, you can tell Office to redirect to your Azure infrastructure, where you control the chain of redirects. This helps attackers make victims believe it’s a legitimate URL since it’s from the Office domain.


From Bing Search to Ransomware: Bumblebee and AdaptixC2 Deliver Akira by The DFIR Report

Excellent and concise DFIR Report write-up on an SEO poisoning attack that led to Akira ransomware. The threat actor targeted victims searching for OpManager and presented a fake website that packaged the legitimate software with Bumblebee. They installed AdaptixC2 next and began moving laterally throughout the network. Once they hit the primary domain controller, they exfiltrated sensitive files and started encrypting the network with Akira.


🔗 Open Source

seifreed/xrefgen

IDA Pro plugin that helps generate cross-references (XRefs) that IDA missed. It particularly focuses on modern compiled languages like Rust/Go/C++ and binaries with advanced obfuscation techniques. It does note it’s designed to be used alongside Mandiant’s XRefer plugin.


0xJs/BYOVD_read_write_primitive

This is a neat PoC repository of several BYOVD techniques. The README contains excellent walkthroughs of each method and the -h output of the binaries with corresponding arguments.


arosenmund/defcon33_silence_kill_edr

Open-source release of a DEFCON 33 training dubbed “Putting EDRs in Their Place”. It’s a collection of techniques designed to bypass and kill EDRs once you land on a victim system that has the security tool in place. They have a bunch of setup tools, an emulated EDR, VMs, and an explanation of the techniques in their README.


Adaptix-Framework/AdaptixC2

Post-exploitation framework with your typical C2 operations, but with a front-end that looks similar to Cobalt Strike. You can build payloads cross-platform, use it as a teamserver to coordinate operations, and write extensions to take advantage of its flexibility.

