Building Your Own OpenClaw: A Technical Deep Dive

Personal AI assistants have moved from novelty to everyday tool. With over 2 billion users on WhatsApp and nearly 1 billion on Telegram as of 2025, messaging platforms have become the de facto interface for daily communication. OpenClaw takes a different approach from hosted assistants: it is a self-hosted, multi-channel AI assistant gateway that runs on your hardware, under your control. This technical deep dive covers the architecture, integration patterns, and security considerations for building your own AI assistant with it.

Understanding the Gateway Architecture

At its core, OpenClaw operates as a WebSocket-based control plane that orchestrates sessions, routing, and channel connections from a single Gateway process. According to the official documentation, the Gateway serves as the “single source of truth” for all system state, eliminating distributed coordination complexity [OpenClaw Docs].

The architecture follows three fundamental principles:

  • Self-hosted infrastructure: Everything runs on your machine or server, ensuring data sovereignty
  • Multi-channel abstraction: One Gateway simultaneously serves WhatsApp, Telegram, Discord, Slack, iMessage, and 10+ additional platforms
  • Agent-native design: Built specifically for coding agents with tool use, persistent sessions, memory management, and multi-agent routing

The Gateway’s RPC mode enables the embedded Pi agent runtime to execute with tool streaming and block streaming capabilities, providing real-time feedback during long-running operations [OpenClaw Agent Runtime Docs].

Session Management: The Heart of Continuity

OpenClaw’s session model controls how conversation state is scoped, reset, and linked across channels:

// Example session configuration
{
  session: {
    dmScope: "per-channel-peer", // Isolate DM context per user
    reset: {
      mode: "daily",
      atHour: 4,
      idleMinutes: 120
    },
    identityLinks: {
      "alice": ["telegram:123456789", "discord:987654321012345678"]
    }
  }
}
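
To make the reset settings concrete, here is a minimal sketch of how a daily boundary and an idle timeout could be evaluated together. The type and function names (ResetPolicy, lastDailyBoundary, isSessionStale) are hypothetical, and the rule that either condition alone triggers a reset is an assumption rather than documented OpenClaw behavior.

```typescript
// Hypothetical shape mirroring the `session.reset` keys in the config above.
interface ResetPolicy {
  mode: string;          // "daily" in the example config
  atHour: number;        // local hour (0-23) of the daily reset boundary
  idleMinutes?: number;  // optional idle timeout
}

// Most recent daily reset boundary at or before `now`, in local time.
function lastDailyBoundary(now: Date, atHour: number): Date {
  const boundary = new Date(
    now.getFullYear(), now.getMonth(), now.getDate(), atHour, 0, 0, 0
  );
  if (boundary.getTime() > now.getTime()) {
    boundary.setDate(boundary.getDate() - 1); // today's boundary not reached yet
  }
  return boundary;
}

// Assumption: a session is stale when EITHER the daily boundary has passed
// since its last activity OR it has been idle longer than idleMinutes.
function isSessionStale(lastActivity: Date, now: Date, policy: ResetPolicy): boolean {
  const crossedDaily =
    policy.mode === "daily" &&
    lastActivity.getTime() < lastDailyBoundary(now, policy.atHour).getTime();
  const idleMs = now.getTime() - lastActivity.getTime();
  const idledOut =
    policy.idleMinutes !== undefined && idleMs > policy.idleMinutes * 60 * 1000;
  return crossedDaily || idledOut;
}
```

With the example values (atHour: 4, idleMinutes: 120), a conversation last touched at 23:00 would count as stale at 05:00 the next morning under this sketch, because the 04:00 boundary falls in between.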

Sessions are stored as JSONL transcripts at ~/.openclaw/agents/<agentId>/sessions/*.jsonl, enabling persistent conversation history across restarts. The system supports four DM scoping modes: main (shared continuity), per-peer (by sender), per-channel-peer (recommended for multi-user), and per-account-channel-peer (for multi-account setups) [OpenClaw Session Docs].
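
The four scoping modes can be illustrated with a small key-derivation sketch. Everything here is illustrative: the names (InboundDm, canonicalPeer, sessionKey) and the key layout are assumptions, not OpenClaw's actual on-disk or in-memory format. The sketch also assumes that identityLinks collapse linked handles onto one canonical name before the key is built.

```typescript
type DmScope = "main" | "per-peer" | "per-channel-peer" | "per-account-channel-peer";

// Hypothetical description of an inbound direct message.
interface InboundDm {
  channel: string;   // e.g. "telegram"
  accountId: string; // which bot account received the message
  peerId: string;    // sender id on that channel
}

// Map "channel:peerId" to a canonical identity if one is linked.
function canonicalPeer(dm: InboundDm, identityLinks: Record<string, string[]>): string {
  const handle = `${dm.channel}:${dm.peerId}`;
  for (const canonicalName of Object.keys(identityLinks)) {
    const handles = identityLinks[canonicalName] || [];
    if (handles.indexOf(handle) !== -1) return canonicalName;
  }
  return dm.peerId;
}

// Illustrative key layout only; each scope adds one more isolating dimension.
function sessionKey(
  agentId: string,
  scope: DmScope,
  dm: InboundDm,
  identityLinks: Record<string, string[]> = {}
): string {
  const peer = canonicalPeer(dm, identityLinks);
  if (scope === "main") return `${agentId}:main`;
  if (scope === "per-peer") return `${agentId}:dm:${peer}`;
  if (scope === "per-channel-peer") return `${agentId}:${dm.channel}:dm:${peer}`;
  return `${agentId}:${dm.channel}:${dm.accountId}:dm:${peer}`;
}
```

Under these assumptions, the linked Telegram and Discord handles for "alice" would share one session in per-peer mode but get separate sessions in per-channel-peer mode.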

WhatsApp Integration: The Baileys Foundation

WhatsApp integration builds on Baileys, a WebSocket-based TypeScript library for interacting with the WhatsApp Web API. Version 7.0.0 of Baileys introduced breaking changes, so implementations written against earlier versions need to be migrated [Baileys GitHub].

The integration handles:

  • QR code pairing: Users scan a code from their WhatsApp mobile app
  • Credential persistence: Session data stored at ~/.openclaw/credentials/whatsapp/<id>/creds.json
  • Group message routing: Mention-based activation with context injection

For group chats, OpenClaw implements an activation mode system:

{
  channels: {
    whatsapp: {
      groups: {
        "*": { requireMention: true }
      }
    }
  },
  agents: {
    list: [{
      id: "main",
      groupChat: {
        historyLimit: 50,
        mentionPatterns: ["@?openclaw", "\\+?15555550123"]
      }
    }]
  }
}

This configuration ensures the bot only responds when explicitly mentioned, with up to 50 recent messages injected as context [OpenClaw Group Messages Docs].
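
The gating logic described above can be sketched as follows. This assumes the mentionPatterns entries are regular expressions (the "@?" and "\\+?" syntax suggests so) and that matching is case-insensitive. The helper names (compileMentionPatterns, isMentioned, buildGroupContext) are hypothetical.

```typescript
interface GroupMessage {
  sender: string;
  text: string;
}

// Compile the configured patterns once. Case-insensitive matching is an assumption.
function compileMentionPatterns(patterns: string[]): RegExp[] {
  return patterns.map((p) => new RegExp(p, "i"));
}

function isMentioned(text: string, patterns: RegExp[]): boolean {
  return patterns.some((re) => re.test(text));
}

// Returns the messages to hand to the agent, or null when the bot should stay silent.
function buildGroupContext(
  history: GroupMessage[],
  incoming: GroupMessage,
  patterns: RegExp[],
  requireMention: boolean,
  historyLimit: number
): GroupMessage[] | null {
  if (requireMention && !isMentioned(incoming.text, patterns)) return null;
  // slice(-0) would return the whole array, so handle a zero limit explicitly.
  const recent = historyLimit > 0 ? history.slice(-historyLimit) : [];
  return recent.concat([incoming]);
}
```

With historyLimit set to 50, a mention in a busy group would yield at most 51 messages in this sketch: the 50 most recent plus the triggering one.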

Telegram Bot Integration via grammY

Telegram integration uses grammY, a modern bot framework that emphasizes developer experience and scalability. The framework supports Node.js, Deno, and Cloudflare Workers through Web API-compatible builds [grammY GitHub].

Basic bot setup requires minimal code:

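import { Bot } from "grammy";

// Replace <BOT_TOKEN> with the token issued by @BotFather.
const bot = new Bot("<BOT_TOKEN>");

// Echo every incoming text message back to the chat it came from.
bot.on("message:text", (ctx) => ctx.reply("Echo: " + ctx.message.text));

// Start receiving updates via long polling.
bot.start();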

Telegram’s Bot API version 9.4 (February 2026) added several features relevant to bot developers. Bots can now send custom emoji in messages, create and manage topics in private chats via createForumTopic, apply visual styling to inline keyboard buttons using the style field, and programmatically manage their own profile photos [Telegram Bot API Docs]. OpenClaw uses these capabilities for rich message formatting and enhanced user interactions.

Key Telegram Features Supported:

  • Long polling and webhook modes
  • Inline keyboard interactions with customizable button styles
  • Message threading in supergroups and private chat topics
  • File and media handling
  • Rate limiting compliance (approximately 20 messages/minute to the same group)
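
One way to stay under the per-group limit in the last bullet is a sliding-window limiter. The sketch below is not taken from OpenClaw or grammY. GroupRateLimiter is a hypothetical class, and the defaults of 20 sends per 60 seconds simply mirror the approximate figure quoted above.

```typescript
class GroupRateLimiter {
  // Send timestamps (ms) per chat id, limited to the current window.
  private sentAt: Record<string, number[]> = {};
  private maxPerWindow: number;
  private windowMs: number;

  constructor(maxPerWindow: number = 20, windowMs: number = 60 * 1000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true and records the send if the chat is under its budget.
  tryAcquire(chatId: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.sentAt[chatId] || []).filter((t) => t > cutoff);
    if (recent.length >= this.maxPerWindow) {
      this.sentAt[chatId] = recent;
      return false; // caller should queue or delay the message
    }
    recent.push(nowMs);
    this.sentAt[chatId] = recent;
    return true;
  }
}
```

A caller would check tryAcquire(chatId, Date.now()) before each send and defer the message when it returns false.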

Security Architecture: Defense in Depth

Running an AI agent with shell access requires careful security consideration. OpenClaw implements a layered defense model addressing the OWASP Top 10 2025 risks, particularly A01:2025 (Broken Access Control) and A05:2025 (Injection) [OWASP Top 10 2025].

DM Access Control Model

OpenClaw enforces four DM policy modes before processing messages:

  1. pairing (default): Unknown senders receive a pairing code; messages ignored until approved
  2. allowlist: Unknown senders blocked without pairing handshake
  3. open: Public access (requires explicit allowFrom: ["*"] configuration)
  4. disabled: Inbound DMs ignored entirely
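
The four modes can be read as a small decision function. The sketch below uses hypothetical names (DmPolicy, DmDecision, decideDm) and assumes that approved senders end up in the allowFrom list and that "open" without the wildcard falls back to allowlist behavior. Neither assumption is confirmed by the documentation cited here.

```typescript
type DmPolicy = "pairing" | "allowlist" | "open" | "disabled";
type DmDecision = "process" | "send-pairing-code" | "drop";

function decideDm(policy: DmPolicy, sender: string, allowFrom: string[]): DmDecision {
  if (policy === "disabled") return "drop"; // inbound DMs ignored entirely

  const explicitlyAllowed = allowFrom.indexOf(sender) !== -1;
  if (policy === "open") {
    // Public access only takes effect with the explicit "*" wildcard.
    const wildcard = allowFrom.indexOf("*") !== -1;
    return wildcard || explicitlyAllowed ? "process" : "drop";
  }

  if (explicitlyAllowed) return "process";
  // Unknown sender: pairing starts a handshake, allowlist blocks silently.
  return policy === "pairing" ? "send-pairing-code" : "drop";
}
```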

The pairing workflow caps pending requests at 3 per channel by default, with codes expiring after 1 hour. Administrators approve via CLI:

openclaw pairing list whatsapp
openclaw pairing approve whatsapp ABC123
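
The cap and expiry rules can be sketched as a small in-memory queue, one instance per channel. PairingQueue and demoPairingCode are hypothetical names, and the defaults (3 pending requests, 1 hour lifetime) come from the description above. Reusing an existing code for a repeat sender is an assumption.

```typescript
interface PendingPairing {
  sender: string;
  code: string;
  createdAtMs: number;
}

// Demo-only generator. Math.random is NOT cryptographically secure;
// a real gateway would draw codes from a CSPRNG.
function demoPairingCode(): string {
  return Math.random().toString(36).slice(2, 8).toUpperCase();
}

class PairingQueue {
  private pending: PendingPairing[] = [];
  private maxPending: number;
  private ttlMs: number;
  private generateCode: () => string;

  constructor(
    maxPending: number = 3,
    ttlMs: number = 60 * 60 * 1000,
    generateCode: () => string = demoPairingCode
  ) {
    this.maxPending = maxPending;
    this.ttlMs = ttlMs;
    this.generateCode = generateCode;
  }

  // Drop requests whose code has expired.
  private prune(nowMs: number): void {
    this.pending = this.pending.filter((p) => nowMs - p.createdAtMs < this.ttlMs);
  }

  // Returns a pairing code, or null when the pending cap is reached.
  request(sender: string, nowMs: number): string | null {
    this.prune(nowMs);
    for (const p of this.pending) {
      if (p.sender === sender) return p.code; // assumption: reuse the live code
    }
    if (this.pending.length >= this.maxPending) return null;
    const code = this.generateCode();
    this.pending.push({ sender, code, createdAtMs: nowMs });
    return code;
  }

  // Returns the approved sender, or null if the code is unknown or expired.
  approve(code: string, nowMs: number): string | null {
    this.prune(nowMs);
    for (let i = 0; i < this.pending.length; i++) {
      const p = this.pending[i];
      if (p !== undefined && p.code === code) {
        this.pending.splice(i, 1);
        return p.sender;
      }
    }
    return null;
  }
}
```

The cap limits how many unknown senders can hold a live code at once, and the expiry keeps stale requests from blocking new ones.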

Security Audit Command

OpenClaw includes a built-in security auditor:

openclaw security audit
openclaw security audit --deep
openclaw security audit --fix

The audit checks:

  • Inbound access policies (DM/group allowlists)
  • Tool blast radius (elevated tools in open rooms)
  • Network exposure (Gateway bind/auth, Tailscale configuration)
  • Browser control exposure (remote node access)
  • Local disk hygiene (permissions, symlinks, synced folders)
  • Plugin trust boundaries

Running --fix automatically tightens configurations:

  • Changes groupPolicy="open" to groupPolicy="allowlist"
  • Re-enables logging.redactSensitive="tools"
  • Sets ~/.openclaw permissions to 700, config files to 600 [OpenClaw Security Docs]
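
The permission targets in the last bullet reduce to a bitmask test. The sketch below shows the idea with hypothetical helpers (isTooPermissive, findLoosePaths) and is not the auditor's actual implementation. In Node.js the mode values could come from fs.statSync(path).mode.

```typescript
// True if `mode` grants any permission bit beyond `maxMode`.
// Masking with 0o777 discards the file-type bits that stat() includes.
function isTooPermissive(mode: number, maxMode: number): boolean {
  return ((mode & 0o777) & ~maxMode) !== 0;
}

// Hypothetical record describing one audited path.
interface PathCheck {
  path: string;
  mode: number;
  isDirectory: boolean;
}

// Directories may be at most 700, files at most 600.
function findLoosePaths(entries: PathCheck[]): string[] {
  return entries
    .filter((e) => isTooPermissive(e.mode, e.isDirectory ? 0o700 : 0o600))
    .map((e) => e.path);
}
```

For example, a config file with mode 644 is flagged because the group and other read bits fall outside 600.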

Prompt Injection Mitigation

The threat model assumes the AI can execute shell commands, read/write files, and access network services. OpenClaw’s stance prioritizes access control before intelligence:

  1. Identity first: DM pairing and allowlists determine who can talk to the bot
  2. Scope next: Group allowlists, mention gating, tool sandboxing limit action radius
  3. Model last: Assume the model can be manipulated; design limited blast radius

Slash commands and directives are honored only for authorized senders, derived from channel allowlists plus commands.useAccessGroups configuration.

Multi-Agent Orchestration

OpenClaw supports routing inbound channels, accounts, and peers to isolated agents through workspace separation. Each agent maintains independent:

  • Session stores (~/.openclaw/agents/<agentId>/sessions/)
  • Auth profiles (~/.openclaw/agents/<agentId>/agent/auth-profiles.json)
  • Workspace directories for tool execution

Configuration example for multi-agent routing:

{
  agents: {
    list: [
      {
        id: "personal",
        model: "anthropic/claude-sonnet-4-20250514",
        workspace: "/home/user/.openclaw/workspaces/personal"
      },
      {
        id: "work",
        model: "openai/gpt-4o",
        workspace: "/home/user/.openclaw/workspaces/work",
        groupChat: {
          mentionPatterns: ["@workbot"]
        }
      }
    ]
  }
}

The Gateway routes messages based on channel + peer combinations, maintaining isolation between agent contexts while allowing cross-agent coordination through the control plane.
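
A minimal sketch of channel + peer routing is shown below. The RouteBinding shape and the routeAgent function are hypothetical. The rule that a channel + peer match beats a channel-only match, with a default agent as fallback, is an assumption about how such routing could work rather than a description of the Gateway's actual resolution order.

```typescript
// Hypothetical binding: route a whole channel, or one peer on it, to an agent.
interface RouteBinding {
  channel: string;
  peerId?: string; // omitted = applies to every peer on the channel
  agentId: string;
}

function routeAgent(
  bindings: RouteBinding[],
  channel: string,
  peerId: string,
  defaultAgentId: string
): string {
  let channelOnly: string | undefined;
  for (const b of bindings) {
    if (b.channel !== channel) continue;
    if (b.peerId === peerId) return b.agentId; // most specific match wins
    if (b.peerId === undefined && channelOnly === undefined) {
      channelOnly = b.agentId; // remember the first channel-wide binding
    }
  }
  return channelOnly !== undefined ? channelOnly : defaultAgentId;
}
```

With the two agents from the example config, a work group chat could be bound to "work" while every other WhatsApp conversation stays with "personal".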

Model Provider Integration

OpenClaw supports OAuth-based subscriptions for major providers:

  • Anthropic: Claude Pro/Max with Claude Opus 4 for long-context strength
  • OpenAI: ChatGPT/Codex integration
  • OpenRouter: Multi-model access through unified API

The documentation recommends Anthropic Pro/Max for “better prompt-injection resistance” due to model-level instruction hardening [OpenClaw GitHub]. Model configuration supports:

  • Profile rotation (OAuth vs API keys)
  • Fallback chains for high availability
  • Per-session model overrides via /new <model>
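
To illustrate how a fallback chain and a per-session override might interact, here is a small selection sketch. ModelCandidate, pickModel, and the cooldown field are hypothetical, and the precedence (override first, then the first candidate not in cooldown) is an assumption. The model ids are the ones used in the configuration examples earlier.

```typescript
// Hypothetical chain entry; a provider that recently failed is put in cooldown.
interface ModelCandidate {
  model: string;
  coolingDownUntilMs?: number;
}

// Returns the model id to use, or null when every candidate is cooling down.
function pickModel(
  chain: ModelCandidate[],
  nowMs: number,
  sessionOverride?: string
): string | null {
  if (sessionOverride) return sessionOverride; // assumption: override bypasses the chain
  for (const candidate of chain) {
    const until = candidate.coolingDownUntilMs;
    if (until === undefined || until <= nowMs) return candidate.model;
  }
  return null;
}
```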

Deployment Best Practices

Local Development

npm install -g openclaw@latest
openclaw onboard --install-daemon
openclaw gateway --port 18789 --verbose

The daemon install creates a launchd (macOS) or systemd (Linux) user service for automatic startup.

Production Considerations

  1. Reverse proxy configuration: Set gateway.trustedProxies for proper client IP detection behind nginx/Caddy
  2. HTTPS requirement: Control UI requires secure context for device identity; use Tailscale Serve for remote access
  3. Credential management: Store WhatsApp/Telegram tokens outside workspace; use environment variables
  4. Backup strategy: Session JSONL files and credential directories require regular backup
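
For the first item, the reason a trusted-proxy list matters can be shown with a generic sketch of X-Forwarded-For handling. This is the common pattern for proxy-aware IP detection, not OpenClaw's actual code. resolveClientIp is a hypothetical function, and exact-match comparison against trustedProxies is a simplifying assumption (real deployments often use CIDR ranges).

```typescript
// Resolve the client IP, honoring X-Forwarded-For only when the direct
// connection comes from a trusted proxy.
function resolveClientIp(
  remoteAddr: string,
  forwardedFor: string | undefined,
  trustedProxies: string[]
): string {
  // An untrusted peer can put anything in the header, so ignore it.
  if (!forwardedFor || trustedProxies.indexOf(remoteAddr) === -1) {
    return remoteAddr;
  }
  const hops = forwardedFor
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
  // Walk right to left: the first hop that is not a trusted proxy is the client.
  for (let i = hops.length - 1; i >= 0; i--) {
    const hop = hops[i];
    if (hop !== undefined && trustedProxies.indexOf(hop) === -1) return hop;
  }
  return remoteAddr;
}
```

Without a trusted-proxy list, every request behind nginx or Caddy would appear to come from the proxy's own address.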

NIST CSF 2.0 Alignment

OpenClaw’s security model aligns with NIST Cybersecurity Framework 2.0 outcomes:

  • Identify: Security audit surfaces misconfigurations
  • Protect: Access control, encryption at rest, permission hardening
  • Detect: Logging, session monitoring, anomaly surfacing
  • Respond: CLI commands for immediate lockdown (openclaw pairing reject)
  • Recover: Session isolation prevents cross-user contamination [NIST CSF 2.0]

Performance and Scalability

The Gateway handles concurrent sessions through event-driven architecture:

  • WebSocket connections: Maintained per channel (WhatsApp, Telegram, Discord simultaneously)
  • Session pruning: Old tool results trimmed from in-memory context before LLM calls
  • Block streaming: Completed assistant blocks sent immediately; configurable chunking (800-1200 chars)
  • Queue modes: steer injects inbound messages mid-run; followup holds until turn completion
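
The chunking behavior in the block streaming bullet can be sketched as a splitter that prefers natural break points. chunkBlock is a hypothetical function. The 800 and 1200 defaults mirror the range quoted above, and the break-point preference (paragraph, then newline, then space) is an assumption.

```typescript
// Split text into chunks of at most maxChars, cutting at a natural break
// at or after minChars when one exists, otherwise hard-splitting at maxChars.
function chunkBlock(text: string, minChars: number = 800, maxChars: number = 1200): string[] {
  const floor = Math.max(1, minChars); // guarantees progress on every pass
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > maxChars) {
    const windowText = rest.slice(0, maxChars);
    let cut = windowText.lastIndexOf("\n\n"); // paragraph break
    if (cut < floor) cut = windowText.lastIndexOf("\n"); // line break
    if (cut < floor) cut = windowText.lastIndexOf(" "); // word break
    if (cut < floor) cut = maxChars; // no usable break point
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut); // separator stays with the next chunk
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Because nothing is trimmed, joining the chunks reproduces the original text exactly, which keeps the streamed output identical to the final message.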

For high-volume deployments, the documentation recommends per-account-channel-peer scoping to prevent session bloat and maintain responsive context windows.

Conclusion

Building a personal AI assistant with OpenClaw requires understanding three interconnected layers: channel integration, session orchestration, and security boundaries. The Gateway architecture abstracts platform-specific complexities while maintaining granular control over access and execution. By following defense-in-depth principles (pairing-based authentication, allowlist enforcement, and regular security audits), you can deploy an AI assistant that responds from your pocket while respecting your privacy and infrastructure boundaries.

The combination of Baileys for WhatsApp, grammY for Telegram, and robust session management creates a foundation that scales from single-user personal assistants to multi-tenant deployments. As AI capabilities expand, the architectural decisions made today, particularly around access control and blast radius limitation, will determine whether your assistant remains a helpful tool or becomes a security liability.

Start with the onboarding wizard (openclaw onboard), lock down your DM policies, and iterate toward the feature set that matches your workflow. The lobster way awaits.