Skip to main content

OpenClaw Security Analysis → B4M CLI Design Principles

What this is. A teardown of OpenClaw's architectural security failures and the design principles we're committing to instead for B4M CLI. This is foundational reasoning that should shape every agent-runtime decision we make.

Related. Connects directly to the B4M V3 Blueprint work — the same trust-boundary thinking shows up in how V3 packages and blueprints separate stable infrastructure (security must be identical everywhere) from product-specific business logic.

The fundamental architecture problem

OpenClaw runs as a single long-lived Node.js process (the “Gateway”) that handles everything: channel connections, session state, the agent loop, model calls, tool execution, and memory persistence. This is a monolith with god-mode permissions. Every integration, every skill, every message flows through one trust boundary.

The result: anyone who can influence the agent's input effectively inherits the agent's permissions. Sophos coined this the “lethal trifecta”:

  1. Access to private data
  2. Ability to communicate externally
  3. Ability to ingest untrusted content

When all three exist in the same execution context, you are a single prompt injection away from full compromise.


The five critical security failures

1. The web server / gateway exposure problem

OpenClaw's Gateway binds to localhost and treats 127.0.0.1 connections as trusted (no auth required). But when deployed behind a reverse proxy (as most people do for remote access), all external requests get forwarded as localhost traffic, bypassing authentication entirely.

  • CVE-2026-25253 (CVSS 8.8): The Control UI accepted a gatewayUrl query parameter from the URL without validation, auto-initiating a WebSocket connection and transmitting the auth token. Three-stage attack chain completes in milliseconds: token exfil → authenticated connection → full RCE.
  • 40,000+ instances exposed on the internet (SecurityScorecard), 63% vulnerable, 12,812 exploitable via RCE.
  • API keys and credentials leaked in plaintext via control panels.

This is exactly the right instinct: why does an autonomous agent need a web server? The answer is: it doesn't. The web server exists for the Control UI and webhook callbacks from messaging platforms. Both can be handled without binding a general-purpose HTTP server.

2. Prompt injection as a structural, unsolved problem

OpenClaw's “guardrails” are prompt instructions — soft guidance that any sufficiently crafted input can override. This is not a bug; it's a category error.

  • Demonstrated attacks: an email containing a prompt injection caused the agent to exfiltrate private keys from the host machine.
  • A user sent themselves an email that caused the bot to forward victim emails to the attacker — no confirmation, no approval.
  • The agent can be steered by malicious content in anything it reads: web pages, emails, documents, images, metadata.
  • Microsoft's assessment: “Assume the runtime can be influenced by untrusted input, its state can be modified, and the host system can be exposed.”

The fix isn't better prompts. It's architectural: separate the reasoning layer from the execution layer with hard boundaries.

3. Skills supply chain poisoning (ClawHavoc campaign)

OpenClaw's skill system (community-contributed SKILL.md files with YAML + natural language) has become a malware distribution vector.

  • 800+ malicious skills discovered (~20% of the ClawHub registry).
  • Top downloaded community skill: literal malware.
  • Skills can execute arbitrary shell commands, exfiltrate data silently via curl to external servers, and inject prompts that bypass safety guidelines.
  • No code signing, no sandboxing, no review process.
  • The “What Would Elon Do?” skill: functionally malware delivering Atomic macOS Stealer (AMOS).

4. No trust boundary isolation

  • Single-process architecture means channel adapters, session management, tool execution, and memory all share one trust domain.
  • No per-user isolation — cross-session data leakage documented.
  • sessionKey is routing/context selection, not per-user auth.
  • Memory persists as plaintext Markdown files on disk, accumulating sensitive data over time.
  • If you mix personal and company identities on one runtime, all separation collapses.

5. Excessive default permissions

  • Full shell access, file read/write, browser control, email access, calendar management — all enabled by default or trivially enabled.
  • Exec approvals are opt-in prompt-based guardrails, not hard architectural boundaries.
  • Sandbox mode is opt-in; if off, commands execute directly on the gateway host.
  • The agent can write its own “vibe code” for tasks it doesn't know — meaning it can self-extend its capabilities at runtime.

What B4M CLI does differently

The six design principles we are committing to.

Principle 1: No web server for agent coordination

Replace the HTTP Gateway pattern with:

  • Unix domain sockets or named pipes for local IPC — no network-bindable surface.
  • Outbound-only connections to messaging platforms (polling or long-poll, not webhook callbacks requiring an open port).
  • If a UI is needed, use a Unix socket ↔ browser bridge that cannot be proxied externally.
  • Zero listening ports by default. Nothing binds to any interface.

Principle 2: Hard execution boundaries (not prompt guardrails)

  • Separate processes for reasoning (LLM interaction) and execution (tool use).
  • The reasoning process produces a structured action request (JSON/YAML intent).
  • The execution process validates against a compiled allowlist (not prompt instructions).
  • High-risk actions (shell exec, file write, email send) require cryptographic approval tokens, not chat-based “reply Y.”
  • Consider capability-based security: the agent only has access to explicitly granted capabilities, not ambient authority.

Principle 3: Untrusted content quarantine

  • All external content (emails, web pages, documents) processed by a read-only, tool-disabled summarizer agent first.
  • Summaries passed to the main agent — never raw untrusted content.
  • This creates an air gap against indirect prompt injection.
  • Content provenance tracking: the agent knows what came from trusted vs. untrusted sources.

Principle 4: No ambient credentials

  • No API keys in environment variables inherited by child processes.
  • Use short-lived, scoped tokens minted per-action.
  • Credentials stored in an OS-level keychain or hardware security module, not plaintext config.
  • The agent process itself never holds persistent credentials.

Principle 5: Deterministic skill / extension model

  • Skills are code-reviewed, signed, and sandboxed — not community YAML that runs arbitrary commands.
  • Skills execute in isolated containers/processes with explicitly declared capabilities.
  • No self-extending: the agent cannot write and execute new code at runtime without human approval through an out-of-band channel.
  • Skill registry with cryptographic attestation, not a wiki anyone can edit.

Principle 6: Decentralized by design

  • Each user runs their own agent instance with their own trust boundary.
  • No shared runtime, no shared memory, no shared credentials.
  • Coordination between agents happens via message passing with verified identities, not shared process space.
  • This is the mycelial model: autonomous nodes communicating through defined interfaces, not a centralized gateway.

The core insight

OpenClaw proved that people desperately want autonomous AI agents. The demand is real. But its architecture treats security as a configuration problem (“just set the right flags”) rather than a structural one.

Bike4Mind CLI's opportunity: deliver the same autonomy with security as an architectural property, not a configuration option.

The web server isn't just unnecessary — it's the symptom of a design philosophy that treats the agent as a service to be exposed rather than a capability to be contained. An agent that coordinates via local IPC, connects outbound-only, quarantines untrusted input, and enforces hard execution boundaries can be just as powerful without being a sitting duck.