Key OpenClaw risks, Clawdbot, Moltbot | Kaspersky official blog

16 February 2026 at 14:16

Everyone has likely heard of OpenClaw, previously known as "Clawdbot" or "Moltbot", the open-source AI assistant that can be deployed on a machine locally. It plugs into popular chat platforms like WhatsApp, Telegram, Signal, Discord, and Slack, which allows it to accept commands from its owner and go to town on the local file system. It has access to the owner's calendar, email, and browser, and can even execute OS commands via the shell.

From a security perspective, that description alone should be enough to give anyone a nervous twitch. But when people start trying to use it for work within a corporate environment, anxiety quickly hardens into the conviction of imminent chaos. Some experts have already dubbed OpenClaw the biggest insider threat of 2026. The issues with OpenClaw cover the full spectrum of risks highlighted in the recent OWASP Top 10 for Agentic Applications.

OpenClaw permits plugging in any local or cloud-based LLM, and supports a wide range of integrations with additional services. At its core is a gateway that accepts commands via chat apps or a web UI, and routes them to the appropriate AI agents. The first iteration, dubbed Clawdbot, dropped in November 2025; by January 2026, it had gone viral — and brought a heap of security headaches with it. In a single week, several critical vulnerabilities were disclosed, malicious skills cropped up in the skill directory, and secrets were leaked from Moltbook (essentially "Reddit for bots"). To top it off, Anthropic issued a trademark demand to rename the project to avoid infringing on "Claude", and the project's X account was hijacked to shill crypto scams.

Known OpenClaw issues

Though the project's developer appears to acknowledge the importance of security, this is a hobbyist project, so there are zero dedicated resources for vulnerability management or other product-security essentials.

OpenClaw vulnerabilities

Among the known vulnerabilities in OpenClaw, the most dangerous is CVE-2026-25253 (CVSS 8.8). Exploiting it leads to a total compromise of the gateway, allowing an attacker to run arbitrary commands. To make matters worse, it’s alarmingly easy to pull off: if the agent visits an attacker’s site or the user clicks a malicious link, the primary authentication token is leaked. With that token in hand, the attacker has full administrative control over the gateway. This vulnerability was patched in version 2026.1.29.

Also, two dangerous command injection vulnerabilities (CVE-2026-24763 and CVE-2026-25157) were discovered.

Insecure defaults and features

A variety of default settings and implementation quirks make attacking the gateway a walk in the park:

  • Authentication is disabled by default, so the gateway is accessible from the internet.
  • The server accepts WebSocket connections without verifying their origin.
  • Localhost connections are implicitly trusted, which is a disaster waiting to happen if the host is running a reverse proxy.
  • Several tools β€” including some dangerous ones β€” are accessible in Guest Mode.
  • Critical configuration parameters leak across the local network via mDNS broadcast messages.

Secrets in plaintext

OpenClaw's configuration, "memory", and chat logs store API keys, passwords, and other credentials for LLMs and integration services in plain text. This is a critical threat — to the extent that versions of the RedLine and Lumma infostealers have already been spotted with OpenClaw file paths added to their must-steal lists. Also, the Vidar infostealer was caught stealing secrets from OpenClaw.

Malicious skills

OpenClaw's functionality can be extended with "skills" available in the ClawHub repository. Since anyone can upload a skill, it didn't take long for threat actors to start "bundling" the AMOS macOS infostealer into their uploads. Within a short time, the number of malicious skills reached the hundreds. This prompted the developers to quickly ink a deal with VirusTotal to ensure all uploaded skills are not only checked against malware databases, but also undergo code and content analysis via LLMs. That said, the authors are very clear: it's no silver bullet.

Structural flaws in the OpenClaw AI agent

Vulnerabilities can be patched and settings can be hardened, but some of OpenClaw’s issues are fundamental to its design. The product combines several critical features that, when bundled together, are downright dangerous:

  • OpenClaw has privileged access to sensitive data on the host machine and the owner’s personal accounts.
  • The assistant is wide open to untrusted data: the agent receives messages via chat apps and email, autonomously browses web pages, etc.
  • It suffers from the inherent inability of LLMs to reliably separate commands from data, making prompt injection a possibility.
  • The agent saves key takeaways and artifacts from its tasks to inform future actions. This means a single successful injection can poison the agent’s memory, influencing its behavior long-term.
  • OpenClaw has the power to talk to the outside world β€” sending emails, making API calls, and utilizing other methods to exfiltrate internal data.

It's worth noting that while OpenClaw is a particularly extreme example, this "Terrifying Five" list is actually characteristic of almost all multi-purpose AI agents.

OpenClaw risks for organizations

If an employee installs an agent like this on a corporate device and hooks it into even a basic suite of services (think Slack and SharePoint), the combination of autonomous command execution, broad file system access, and excessive OAuth permissions creates fertile ground for a deep network compromise. In fact, the bot’s habit of hoarding unencrypted secrets and tokens in one place is a disaster waiting to happen β€” even if the AI agent itself is never compromised.

On top of that, these configurations violate regulatory requirements across multiple countries and industries, leading to potential fines and audit failures. Current regulatory requirements, like those in the EU AI Act or the NIST AI Risk Management Framework, explicitly mandate strict access control for AI agents. OpenClaw’s configuration approach clearly falls short of those standards.

But the real kicker is that even if employees are banned from installing this software on work machines, OpenClaw can still end up on their personal devices. This also creates specific risks for the organization as a whole:

  • Personal devices frequently store access to work systems like corporate VPN configs or browser tokens for email and internal tools. These can be hijacked to gain a foothold in the company’s infrastructure.
  • Because the agent is controlled via chat apps, it's not just the employee who becomes a target for social engineering, but also their AI agent: account takeovers and impersonation of the user in chats with colleagues (among other scams) become a reality. Even if work is only occasionally discussed in personal chats, the info in them is ripe for the picking.
  • If an AI agent on a personal device is hooked into any corporate services (email, messaging, file storage), attackers can manipulate the agent to siphon off data, and this activity would be extremely difficult for corporate monitoring systems to spot.

How to detect OpenClaw

Depending on the SOC team’s monitoring and response capabilities, they can track OpenClaw gateway connection attempts on personal devices or in the cloud. Additionally, a specific combination of red flags can indicate OpenClaw’s presence on a corporate device:

  • Look for ~/.openclaw/, ~/clawd/, or ~/.clawdbot directories on host machines.
  • Scan the network with internal tools, or public ones like Shodan, to identify the HTML fingerprints of Clawdbot control panels.
  • Monitor for WebSocket traffic on ports 3000 and 18789.
  • Keep an eye out for mDNS broadcast messages on port 5353 (specifically openclaw-gw.tcp).
  • Watch for unusual authentication attempts in corporate services, such as new App ID registrations, OAuth Consent events, or User-Agent strings typical of Node.js and other non-standard user agents.
  • Look for access patterns typical of automated data harvesting: reading massive chunks of data (scraping all files or all emails) or scanning directories at fixed intervals during off-hours.
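As a rough illustration, the host-side indicators above can be checked with a short script. This is a minimal sketch: the directory names and ports come straight from the list, while the function names and timeout are illustrative choices, not part of any official tooling.

```python
import socket
from pathlib import Path

# Indicators from the detection checklist; adjust for your environment.
SUSPECT_DIRS = [".openclaw", "clawd", ".clawdbot"]
SUSPECT_PORTS = [3000, 18789]

def find_suspect_dirs(home: Path) -> list[Path]:
    """Return any OpenClaw-style directories present under a home directory."""
    return [home / d for d in SUSPECT_DIRS if (home / d).is_dir()]

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Check whether a TCP port (e.g. a gateway WebSocket listener) accepts connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for path in find_suspect_dirs(Path.home()):
        print(f"[!] Found suspect directory: {path}")
    for port in SUSPECT_PORTS:
        if port_open("127.0.0.1", port):
            print(f"[!] Local listener on port {port} - investigate")
```

In a real SOC workflow, checks like these would be pushed out via EDR queries or a configuration-compliance scan rather than run ad hoc.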

Controlling shadow AI

A set of security hygiene practices can effectively shrink the footprint of both shadow IT and shadow AI, making it much harder to deploy OpenClaw in an organization:

  • Use host-level allowlisting to ensure only approved applications and cloud integrations are installed. For products that support extensibility (like Chrome extensions, VS Code plugins, or OpenClaw skills), implement a closed list of vetted add-ons.
  • Conduct a full security assessment of any product or service, AI agents included, before allowing them to hook into corporate resources.
  • Treat AI agents with the same rigorous security requirements applied to public-facing servers that process sensitive corporate data.
  • Implement the principle of least privilege for all users and other identities.
  • Don’t grant administrative privileges without a critical business need. Require all users with elevated permissions to use them only when performing specific tasks rather than working from privileged accounts all the time.
  • Configure corporate services so that technical integrations (like apps requesting OAuth access) are granted only the bare minimum permissions.
  • Periodically audit integrations, OAuth tokens, and permissions granted to third-party apps. Review the need for these with business owners, proactively revoke excessive permissions, and kill off stale integrations.
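The periodic audit can be partly automated. Below is a minimal Python sketch that flags stale or over-scoped OAuth grants from an identity-provider export; the field names (`app`, `scopes`, `last_used`) and the list of "broad" scopes are assumptions — map them to whatever your provider actually emits.

```python
from datetime import datetime, timedelta, timezone

# Scope names here are illustrative examples of overly broad grants.
BROAD_SCOPES = {"full_access", "admin", "mail.readwrite", "files.readwrite.all"}

def flag_integrations(grants, max_idle_days=90, now=None):
    """Return (app, reasons) pairs for grants that look stale or over-scoped."""
    now = now or datetime.now(timezone.utc)
    flagged = []
    for g in grants:
        reasons = []
        if (now - g["last_used"]) > timedelta(days=max_idle_days):
            reasons.append("stale")
        if BROAD_SCOPES & set(g["scopes"]):
            reasons.append("over-scoped")
        if reasons:
            flagged.append((g["app"], reasons))
    return flagged
```

Output like this is then reviewed with the business owners, and anything stale or excessive is revoked.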

Secure deployment of agentic AI

If an organization allows AI agents in an experimental capacity β€” say, for development testing or efficiency pilots β€” or if specific AI use cases have been greenlit for general staff, robust monitoring, logging, and access control measures should be implemented:

  • Deploy agents in an isolated subnet with strict ingress and egress rules, limiting communication only to trusted hosts required for the task.
  • Use short-lived access tokens with a strictly limited scope of privileges. Never hand an agent tokens that grant access to core company servers or services. Ideally, create dedicated service accounts for every individual test.
  • Wall off the agent from dangerous tools and data sets that aren’t relevant to its specific job. For experimental rollouts, it’s best practice to test the agent using purely synthetic data that mimics the structure of real production data.
  • Configure detailed logging of the agent’s actions. This should include event logs, command-line parameters, and chain-of-thought artifacts associated with every command it executes.
  • Set up SIEM to flag abnormal agent activity. The same techniques and rules used to detect LotL attacks are applicable here, though additional efforts to define what normal activity looks like for a specific agent are required.
  • If MCP servers and additional agent skills are used, scan them with the security tools emerging for these tasks, such as skill-scanner, mcp-scanner, or mcp-scan. Specifically for OpenClaw testing, several companies have already released open-source tools to audit the security of its configurations.
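The short-lived, narrowly scoped token idea is worth spelling out. The sketch below illustrates the expiry-plus-scope pattern with an HMAC-signed blob; in production you would lean on your identity provider's token service instead of rolling your own.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

# Per-process demo key; a real deployment uses a managed, rotated secret.
SIGNING_KEY = secrets.token_bytes(32)

def issue_token(scopes, ttl_seconds=300):
    """Issue a signed token carrying an explicit scope list and expiry."""
    payload = {"scopes": scopes, "exp": time.time() + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    sig = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def verify_token(token, required_scope):
    """Accept only unexpired tokens with a valid signature and matching scope."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    payload = json.loads(base64.urlsafe_b64decode(body))
    return payload["exp"] > time.time() and required_scope in payload["scopes"]
```

The point of the pattern: even if an agent leaks its token, the blast radius is bounded by the scope list and the short lifetime.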

Corporate policies and employee training

A flat-out ban on all AI tools is a simple but rarely productive path. Employees usually find workarounds β€” driving the problem into the shadows where it’s even harder to control. Instead, it’s better to find a sensible balance between productivity and security.

Implement transparent policies on using agentic AI. Define which data categories are okay for external AI services to process, and which are strictly off-limits. Employees need to understand why something is forbidden. A policy of "yes, but with guardrails" is always received better than a blanket "no".

Train with real-world examples. Abstract warnings about "leakage risks" tend to be futile. It's better to demonstrate how an agent with email access can forward confidential messages just because a random incoming email asked it to. When the threat feels real, motivation to follow the rules grows too. Ideally, employees should complete a brief crash course on AI security.

Offer secure alternatives. If employees need an AI assistant, provide an approved tool that features centralized management, logging, and OAuth access control.

Agentic AI security measures based on the OWASP ASI Top 10

26 January 2026 at 16:26

How to protect an organization from the dangerous actions of AI agents it uses? This isn't just a theoretical what-if anymore — the actual damage autonomous AI can do ranges from providing poor customer service to destroying corporate primary databases. It's a question business leaders are currently hammering away at, and government agencies and security experts are racing to provide answers to.

For CIOs and CISOs, AI agents create a massive governance headache. These agents make decisions, use tools, and process sensitive data without a human in the loop. Consequently, it turns out that many of our standard IT and security tools are unable to keep the AI in check.

The non-profit OWASP Foundation has released a handy playbook on this very topic. Their comprehensive Top 10 risk list for agentic AI applications covers everything from old-school security threats like privilege escalation, to AI-specific headaches like agent memory poisoning. Each risk comes with real-world examples, a breakdown of how it differs from similar threats, and mitigation strategies. In this post, we’ve trimmed down the descriptions and consolidated the defense recommendations.

The top-10 risks of deploying autonomous AI agents. Source

Agent goal hijack (ASI01)

This risk involves manipulating an agent’s tasks or decision-making logic by exploiting the underlying model’s inability to tell the difference between legitimate instructions and external data. Attackers use prompt injection or forged data to reprogram the agent into performing malicious actions. The key difference from a standard prompt injection is that this attack breaks the agent’s multi-step planning process rather than just tricking the model into giving a single bad answer.

Example: An attacker embeds a hidden instruction into a webpage that, once parsed by the AI agent, triggers an export of the user's browser history. A vulnerability of this very nature was showcased in the EchoLeak study.
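One common first-line mitigation is to screen untrusted content for instruction-like phrases before it reaches the agent's context. The pattern list below is a crude, illustrative heuristic, not an exhaustive filter — real defenses combine heuristics with dedicated classifier models.

```python
import re

# Illustrative red-flag phrases; a production filter would be far broader.
INJECTION_PATTERNS = [
    r"ignore (all|any|your) (previous|prior) instructions",
    r"you are now",
    r"export .* (history|credentials|cookies)",
    r"do not (tell|inform) the user",
]

def looks_like_injection(text: str) -> bool:
    """Return True if untrusted text contains instruction-like phrasing."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)
```

Flagged content can then be stripped, quarantined, or passed to the agent with an explicit "untrusted data" marker.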

Tool misuse and exploitation (ASI02)

This risk crops up when an agent β€” driven by ambiguous commands or malicious influence β€” uses the legitimate tools it has access to in unsafe or unintended ways. Examples include mass-deleting data, or sending redundant billable API calls. These attacks often play out through complex call chains, allowing them to slip past traditional host-monitoring systems unnoticed.

Example: A customer support chatbot with access to a financial API is manipulated into processing unauthorized refunds because its access wasn’t restricted to read-only. Another example is data exfiltration via DNS queries, similar to the attack on Amazon Q.
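The refund example points at the fix: expose only an allowlisted, read-only subset of the backend to the agent. The sketch below illustrates the idea; `FinanceAPI` and its methods are hypothetical stand-ins, not any real vendor API.

```python
class FinanceAPI:
    """Hypothetical backend with both safe and dangerous operations."""
    def get_balance(self, account):
        return 100.0

    def issue_refund(self, account, amount):
        raise RuntimeError("side effect: money moved!")

class ReadOnlyProxy:
    """Only allowlisted, side-effect-free methods are reachable by the agent."""
    ALLOWED = {"get_balance"}

    def __init__(self, backend):
        self._backend = backend

    def call(self, method, *args, **kwargs):
        if method not in self.ALLOWED:
            raise PermissionError(f"tool call '{method}' not permitted for this agent")
        return getattr(self._backend, method)(*args, **kwargs)
```

Even a fully hijacked agent can then only read balances — it never holds a code path that processes refunds.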

Identity and privilege abuse (ASI03)

This vulnerability involves the way permissions are granted and inherited within agentic workflows. Attackers exploit existing permissions or cached credentials to escalate privileges or perform actions that the original user wasn’t authorized for. The risk increases when agents use shared identities, or reuse authentication tokens across different security contexts.

Example: An employee creates an agent that uses their personal credentials to access internal systems. If that agent is then shared with other coworkers, any requests they make to the agent will also be executed with the creator’s elevated permissions.

Agentic Supply Chain Vulnerabilities (ASI04)

Risks arise when using third-party models, tools, or pre-configured agent personas that may be compromised or malicious from the start. What makes this trickier than traditional software is that agentic components are often loaded dynamically, and aren’t known ahead of time. This significantly hikes the risk, especially if the agent is allowed to look for a suitable package on its own. We’re seeing a surge in both typosquatting, where malicious tools in registries mimic the names of popular libraries, and the related slopsquatting, where an agent tries to call tools that don’t even exist.

Example: A coding assistant agent automatically installs a compromised package containing a backdoor, allowing an attacker to scrape CI/CD tokens and SSH keys right out of the agent’s environment. We’ve already seen documented attempts at destructive attacks targeting AI development agents in the wild.

Unexpected code execution / RCE (ASI05)

Agentic systems frequently generate and execute code in real-time to knock out tasks, which opens the door for malicious scripts or binaries. Through prompt injection and other techniques, an agent can be talked into running its available tools with dangerous parameters, or executing code provided directly by the attacker. This can escalate into a full container or host compromise, or a sandbox escape — at which point the attack becomes invisible to standard AI monitoring tools.

Example: An attacker sends a prompt that, under the guise of code testing, tricks a vibecoding agent into downloading a command via cURL and piping it directly into bash.

Memory and context poisoning (ASI06)

Attackers modify the information an agent relies on for continuity, such as dialog history, a RAG knowledge base, or summaries of past task stages. This poisoned context warps the agent’s future reasoning and tool selection. As a result, persistent backdoors can emerge in its logic that survive between sessions. Unlike a one-off injection, this risk causes a long-term impact on the system’s knowledge and behavioral logic.

Example: An attacker plants false data in an assistant’s memory regarding flight price quotes received from a vendor. Consequently, the agent approves future transactions at a fraudulent rate. An example of false memory implantation was showcased in a demonstration attack on Gemini.
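A typical mitigation is provenance-tagged memory: entries from untrusted sources are quarantined for review instead of being silently merged into the knowledge the agent reasons over. The source labels below are illustrative.

```python
# Sources whose facts may enter the agent's working memory directly.
TRUSTED_SOURCES = {"operator", "verified_vendor_feed"}

class AgentMemory:
    def __init__(self):
        self.facts = []        # usable by the agent
        self.quarantine = []   # awaiting human review

    def remember(self, fact: str, source: str):
        """Store a fact, routing untrusted sources into quarantine."""
        entry = {"fact": fact, "source": source}
        (self.facts if source in TRUSTED_SOURCES else self.quarantine).append(entry)

    def recall(self):
        """Only vetted facts influence future reasoning."""
        return [e["fact"] for e in self.facts]
```

With this split, a poisoned price quote scraped from a webpage never reaches the context that drives transaction approvals.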

Insecure inter-agent communication (ASI07)

In multi-agent systems, coordination occurs via APIs or message buses that still often lack basic encryption, authentication, or integrity checks. Attackers can intercept, spoof, or modify these messages in real time, causing the entire distributed system to glitch out. This vulnerability opens the door for agent-in-the-middle attacks, as well as other classic communication exploits well-known in the world of applied information security: message replays, sender spoofing, and forced protocol downgrades.

Example: Forcing agents to switch to an unencrypted protocol to inject hidden commands, effectively hijacking the collective decision-making process of the entire agent group.
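The baseline defense is message authentication. The sketch below signs each inter-agent message with an HMAC so that a tampered or spoofed message fails verification; a real deployment would add replay protection (nonces or timestamps) and mutual TLS on the transport, and would use a managed key rather than the hardcoded demo value.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"demo-key-rotate-me"  # placeholder; use a managed secret in practice

def sign_message(payload: dict) -> dict:
    """Attach an HMAC over a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {"body": body.decode(),
            "mac": hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()}

def verify_message(msg: dict) -> bool:
    """Reject any message whose body no longer matches its MAC."""
    expected = hmac.new(SHARED_KEY, msg["body"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(msg["mac"], expected)
```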

Cascading failures (ASI08)

This risk describes how a single error β€” caused by hallucination, a prompt injection, or any other glitch β€” can ripple through and amplify across a chain of autonomous agents. Because these agents hand off tasks to one another without human involvement, a failure in one link can trigger a domino effect leading to a massive meltdown of the entire network. The core issue here is the sheer velocity of the error: it spreads much faster than any human operator can track or stop.

Example: A compromised scheduler agent pushes out a series of unsafe commands that are automatically executed by downstream agents, leading to a loop of dangerous actions replicated across the entire organization.

Human–agent trust exploitation (ASI09)

Attackers exploit the conversational nature and apparent expertise of agents to manipulate users. Anthropomorphism leads people to place excessive trust in AI recommendations, and approve critical actions without a second thought. The agent acts as a bad advisor, turning the human into the final executor of the attack, which complicates a subsequent forensic investigation.

Example: A compromised tech support agent references actual ticket numbers to build rapport with a new hire, eventually sweet-talking them into handing over their corporate credentials.

Rogue agents (ASI10)

These are malicious, compromised, or hallucinating agents that veer off their assigned functions, operate stealthily, or act as parasites within the system. Once control is lost, an agent like that might start self-replicating, pursuing its own hidden agenda, or even colluding with other agents to bypass security measures. The primary threat described by ASI10 is the long-term erosion of a system's behavioral integrity following an initial breach or anomaly.

Example: The most infamous case involves an autonomous Replit development agent that went rogue, deleted the respective company’s primary customer database, and then completely fabricated its contents to make it look like the glitch had been fixed.

Mitigating risks in agentic AI systems

While the probabilistic nature of LLM generation and the lack of separation between instructions and data channels make bulletproof security impossible, a rigorous set of controls β€” approximating a Zero Trust strategy β€” can significantly limit the damage when things go awry. Here are the most critical measures.

Enforce the principles of both least autonomy and least privilege. Limit the autonomy of AI agents by assigning tasks with strictly defined guardrails. Ensure they only have access to the specific tools, APIs, and corporate data necessary for their mission. Dial permissions down to the absolute minimum where appropriate β€” for example, sticking to read-only mode.

Use short-lived credentials. Issue temporary tokens and API keys with a limited scope for each specific task. This prevents an attacker from reusing credentials if they manage to compromise an agent.

Mandatory human-in-the-loop for critical operations. Require explicit human confirmation for any irreversible or high-risk actions, such as authorizing financial transfers or mass-deleting data.
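A human-in-the-loop gate can be sketched in a few lines. The action names and the approval callback below are hypothetical stand-ins for a real review workflow (a ticket, a chat prompt to the operator, and so on).

```python
# Actions that must never execute autonomously; illustrative names.
HIGH_RISK = {"transfer_funds", "delete_records"}

def execute(action: str, params: dict, approve_callback) -> str:
    """Run low-risk actions directly; hold high-risk ones for human approval."""
    if action in HIGH_RISK and not approve_callback(action, params):
        return "blocked: awaiting human approval"
    return f"executed {action}"
```

The key property: the gate sits outside the model, so no prompt injection can talk the agent past it.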

Execution isolation and traffic control. Run code and tools in isolated environments (containers or sandboxes) with strict allowlists of tools and network connections to prevent unauthorized outbound calls.

Policy enforcement. Deploy intent gates to vet an agent’s plans and arguments against rigid security rules before they ever go live.

Input and output validation and sanitization. Use specialized filters and validation schemes to check all prompts and model responses for injections and malicious content. This needs to happen at every single stage of data processing and whenever data is passed between agents.

Continuous secure logging. Record every agent action and inter-agent message in immutable logs. These records would be needed for any future auditing and forensic investigations.

Behavioral monitoring and watchdog agents. Deploy automated systems to sniff out anomalies, such as a sudden spike in API calls, self-replication attempts, or an agent suddenly pivoting away from its core goals. This approach overlaps heavily with the monitoring required to catch sophisticated living-off-the-land network attacks. Consequently, organizations that have introduced XDR and are crunching telemetry in a SIEM will have a head start here β€” they’ll find it much easier to keep their AI agents on a short leash.

Supply chain control and SBOMs (software bills of materials). Only use vetted tools and models from trusted registries. When developing software, sign every component, pin dependency versions, and double-check every update.

Static and dynamic analysis of generated code. Scan every line of code an agent writes for vulnerabilities before running it. Ban the use of dangerous functions like eval() completely. These last two tips should already be part of a standard DevSecOps workflow, and they need to be extended to all code written by AI agents. Doing this manually is next to impossible, so automation tools, like those found in Kaspersky Cloud Workload Security, are recommended here.
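A simple pre-execution gate along these lines can be built on Python's own `ast` module: parse the generated code and reject calls to dangerous builtins before anything runs. This is a minimal sketch — real pipelines layer full SAST tooling on top of cheap checks like this.

```python
import ast

# Builtins we refuse to let agent-generated code call.
BANNED_CALLS = {"eval", "exec", "compile", "__import__"}

def uses_banned_call(source: str) -> bool:
    """Return True if the source calls a banned builtin (or fails to parse)."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return True  # refuse code that does not even parse
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                return True
    return False
```

Note this only catches direct calls by name; aliased or dynamically resolved calls need the heavier tooling mentioned above.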

Securing inter-agent communications. Ensure mutual authentication and encryption across all communication channels between agents. Use digital signatures to verify message integrity.

Kill switches. Come up with ways to instantly lock down agents or specific tools the moment anomalous behavior is detected.

Using UI for trust calibration. Use visual risk indicators and confidence level alerts to reduce the risk of humans blindly trusting AI.

User training. Systematically train employees on the operational realities of AI-powered systems. Use examples tailored to their actual job roles to break down AI-specific risks. Given how fast this field moves, a once-a-year compliance video won’t cut it β€” such training should be refreshed several times a year.

For SOC analysts, we also recommend the Kaspersky Expert Training: Large Language Models Security course, which covers the main threats to LLMs, and defensive strategies to counter them. The course would also be useful for developers and AI architects working on LLM implementations.
