Self-hosted WhatsApp bot platform: the open-source playbook

Run a multi-tenant WhatsApp bot platform on your own infra without the WhatsApp Business API contract. Open-source, BYO-LLM, with per-customer isolation and cost tracking. Here's the architecture.

By Dipankar Sarkar, June 2, 2026

WhatsApp
bot platform
self-hosted
Baileys
open source

You want to run a WhatsApp bot platform, either as a SaaS product for customers or as the messaging layer for your own product. The path most people take is signing up for the WhatsApp Business API via a BSP (Twilio, Vonage, 360dialog). That works, but it binds you to per-message fees, a contract, and their infrastructure.

There’s a self-hosted alternative that’s a much better fit for many use cases: run your own multi-tenant gateway on a VM you control, connect each tenant to their WhatsApp via the multi-device protocol, and pay only for your LLM provider and your VM.

Why self-host?

No per-message fee. The WhatsApp Cloud API bills per conversation or per message, depending on Meta's pricing model at the time; self-hosting via Baileys adds no marginal cost beyond your VM.
Bring your own LLM. Anthropic, OpenAI, Llama, Mistral — your choice. The cloud bot platforms typically lock you to one.
Data residency. Conversations stay on your hardware in your region.
Customization. Drop in any tool, any prompt, any agent behavior. No proprietary flow language.
Per-customer billing. Meter each tenant’s LLM cost and charge them what makes sense for your business.

The trade-offs are real: you operate the gateway, you handle the QR-code re-pairing when WhatsApp deauthorizes a session, and Baileys is unofficial (so a particularly hostile Meta policy change could break it). For most SMB use cases, the trade-offs land in your favor.

The stack

A self-hosted WhatsApp bot platform needs four things:

A WhatsApp adapter. Baileys is the standard for the multi-device protocol.
An agent runtime. Something that takes an inbound message and produces a reply, with tool-use, memory, and personality.
Tenant isolation. Each customer’s conversations, memory, and credentials kept separate.
Cost accounting. Per-tenant token tracking so you can bill rationally.

OpenClawMU bundles all four. The flow:

WhatsApp ──Baileys──► OpenClawMU ──tenant-routed──► Agent runtime
                          │                              │
                          ├── per-tenant session store ──┘
                          ├── per-tenant memory (sqlite-vec)
                          ├── per-tenant sandbox
                          └── per-tenant cost accounting

Pairing a tenant’s WhatsApp

The CLI walks the QR-code dance. The end-user’s phone scans the QR; Baileys negotiates the device-paired session; the credentials are stored in the tenant’s directory.

openclaw channels pair whatsapp --tenant acme
# → shows a QR code for the phone to scan; on success, /tenants/acme/channels/whatsapp.json is written

Once paired, inbound messages from that WhatsApp account route to the acme tenant’s agent. The agent’s reply is sent back through the same Baileys session.
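
The routing described above amounts to a lookup from a paired account to its tenant, plus a per-tenant credentials path. Here is a minimal sketch in Python. `PairingRegistry`, `credentials_path`, and the tenant-name rule are hypothetical names and assumptions made for illustration, not OpenClawMU's actual API; only the `/tenants/<tenant>/channels/whatsapp.json` layout comes from the CLI output shown earlier.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, Optional

# Assumption: tenant names are simple slugs. This rule is illustrative only.
TENANT_SLUG = re.compile(r"[a-z0-9][a-z0-9_-]*")


def credentials_path(tenant: str, root: str = "/tenants") -> PurePosixPath:
    """Return the per-tenant credentials file, rejecting unsafe tenant names."""
    if not TENANT_SLUG.fullmatch(tenant):
        raise ValueError(f"invalid tenant name: {tenant!r}")
    return PurePosixPath(root) / tenant / "channels" / "whatsapp.json"


class PairingRegistry:
    """Hypothetical in-memory map from a paired WhatsApp account to its tenant."""

    def __init__(self) -> None:
        self._tenant_by_account: Dict[str, str] = {}

    def pair(self, account_id: str, tenant: str) -> None:
        credentials_path(tenant)  # raises ValueError on an unsafe tenant name
        current = self._tenant_by_account.get(account_id)
        if current is not None and current != tenant:
            raise ValueError(f"{account_id} is already paired to {current}")
        self._tenant_by_account[account_id] = tenant

    def tenant_for(self, account_id: str) -> Optional[str]:
        """Look up which tenant's agent should receive an inbound message."""
        return self._tenant_by_account.get(account_id)
```

Validating the tenant name before building the path keeps a name such as `../etc` from escaping the tenant directory, which matters when credentials for many customers share one disk.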

Inbound message flow

Every inbound is normalized into a tenant-tagged envelope:

{
  "tenant": "acme",
  "channel": "whatsapp",
  "user": {
    "id": "wa:+15551234567",
    "display_name": "Jane Doe"
  },
  "session_id": "wa:+15551234567:default",
  "content": { "type": "text", "text": "How many invoices are overdue?" },
  "received_at": "2026-06-03T10:14:22Z"
}
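
To make the envelope shape concrete, here is a rough Python sketch of a normalizer that produces it. The function name `normalize_inbound` and its parameters are hypothetical, and treating the `:default` suffix of `session_id` as a conversation label is an assumption drawn from the example, not a documented rule.

```python
from datetime import datetime, timezone


def normalize_inbound(tenant: str, sender_e164: str, display_name: str,
                      text: str, received_at: datetime,
                      conversation: str = "default") -> dict:
    """Build a tenant-tagged envelope shaped like the example above."""
    if received_at.tzinfo is None:
        # A naive datetime would be silently interpreted as local time.
        raise ValueError("received_at must be timezone-aware")
    user_id = f"wa:{sender_e164}"
    return {
        "tenant": tenant,
        "channel": "whatsapp",
        "user": {"id": user_id, "display_name": display_name},
        # Assumption: session ids are "<user id>:<conversation label>".
        "session_id": f"{user_id}:{conversation}",
        "content": {"type": "text", "text": text},
        # Always emit UTC with a trailing "Z", as in the example.
        "received_at": received_at.astimezone(timezone.utc)
                                  .strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```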

The agent runtime processes this envelope, executes whatever tools it needs (looking up the invoice DB, etc.), and produces a reply. The reply goes back to the Baileys adapter, which translates it into WhatsApp-native form (markdown → text formatting, line breaks preserved) and sends it.
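
The markdown-to-WhatsApp translation can be approximated with a few substitutions, since WhatsApp marks bold with single asterisks, italics with underscores, and strikethrough with single tildes. The sketch below is an assumption about what such a step might look like, not the adapter's actual code, and it only covers these three inline styles.

```python
import re

_PLACEHOLDER = "\x00"  # temporary marker, assumed not to occur in replies


def markdown_to_whatsapp(text: str) -> str:
    """Convert common inline markdown to WhatsApp formatting markers."""
    # 1. Park **bold** behind a placeholder so step 2 does not treat it as italic.
    text = re.sub(r"\*\*(.+?)\*\*",
                  lambda m: f"{_PLACEHOLDER}{m.group(1)}{_PLACEHOLDER}", text)
    # 2. *italic* becomes _italic_; "* item" list bullets are left alone
    #    because the opening asterisk must not be followed by whitespace.
    text = re.sub(r"(?<!\*)\*(?!\s)(.+?)(?<!\s)\*(?!\*)", r"_\1_", text)
    # 3. Restore bold using WhatsApp's single-asterisk form.
    text = text.replace(_PLACEHOLDER, "*")
    # 4. ~~strike~~ becomes ~strike~.
    text = re.sub(r"~~(.+?)~~", r"~\1~", text)
    return text
```

Because `.` does not match a newline, formatting spans never cross line boundaries and line breaks pass through unchanged.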

Handling media

WhatsApp messages can include images, videos, voice notes, and documents. Each is normalized into a content block with a type and a locally stored path:

image → routed to a vision-capable model (Claude Opus, GPT-4o).
voice → transcribed via Whisper (local or API), then treated as text.
document → text-extracted via pdfjs / docx / etc., then included as context.

Outbound media is symmetric: the agent can attach an image (e.g., a generated chart) and the adapter uploads it via WhatsApp’s media endpoints.
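
The inbound routing rules above can be sketched as a small dispatch function. The content-block keys (`type`, `path`, `text`) follow the description in this section; the `transcribe` and `extract_text` callables and the returned `route` labels are hypothetical stand-ins for whatever speech-to-text and document-extraction tools you plug in.

```python
from typing import Callable, Dict


def to_model_input(content: Dict[str, str],
                   transcribe: Callable[[str], str],
                   extract_text: Callable[[str], str]) -> Dict[str, str]:
    """Decide how one normalized content block reaches the model."""
    kind = content["type"]
    if kind == "text":
        return {"route": "text", "text": content["text"]}
    if kind == "image":
        # Images go to a vision-capable model by file path.
        return {"route": "vision", "image_path": content["path"]}
    if kind == "voice":
        # Voice notes are transcribed first, then handled like text.
        return {"route": "text", "text": transcribe(content["path"])}
    if kind == "document":
        # Documents contribute extracted text as extra context.
        return {"route": "text", "context": extract_text(content["path"])}
    raise ValueError(f"unsupported content type: {kind!r}")
```

Passing the transcriber and extractor in as arguments keeps the dispatch testable without audio files or PDFs on disk.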

Cost accounting

Every LLM call records a billing row scoped to the tenant. At the end of the month, generate a CSV:

openclaw billing report acme --period current-month --csv > acme-2026-06.csv

Pipe that into Stripe Billing, QuickBooks, or your own invoicing flow. The customer sees an itemized usage statement; you pocket the margin over your LLM provider’s cost.
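
For intuition about what such a report aggregates, here is a rough Python sketch that rolls per-call billing rows up into a per-tenant CSV. The `BillingRow` fields, the CSV columns, and the 30% markup are all assumptions chosen for illustration; the real report format may differ.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class BillingRow:
    """One LLM call, scoped to a tenant (hypothetical schema)."""
    tenant: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: Decimal  # what the LLM provider charged for this call


def usage_csv(rows: Iterable[BillingRow], tenant: str,
              markup: Decimal = Decimal("1.30")) -> str:
    """Aggregate one tenant's rows per model and render them as CSV."""
    totals = defaultdict(lambda: [0, 0, Decimal("0")])
    for row in rows:
        if row.tenant != tenant:
            continue  # never leak another tenant's usage into the report
        bucket = totals[row.model]
        bucket[0] += row.input_tokens
        bucket[1] += row.output_tokens
        bucket[2] += row.cost_usd
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["model", "input_tokens", "output_tokens",
                     "provider_cost_usd", "billed_usd"])
    for model in sorted(totals):
        tokens_in, tokens_out, cost = totals[model]
        billed = (cost * markup).quantize(Decimal("0.01"))
        writer.writerow([model, tokens_in, tokens_out, cost, billed])
    return out.getvalue()
```

Using `Decimal` rather than floats avoids rounding drift when many small per-call costs are summed into an invoice line.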

Reliability concerns

Session expiry. WhatsApp will occasionally invalidate a multi-device session. The fix is to re-pair by scanning a fresh QR code. Build a re-pair UX so your customers can do this themselves without paging your support team.
Rate limits. WhatsApp throttles per-account; respect their guidance on message-send rates.
Backups. openclaw tenants backup acme --to s3://... snapshots the full tenant state, including the WhatsApp credentials. Schedule nightly.
Multi-region resilience. Run a hot-standby gateway in a second region with cross-region S3 replication. RTO ~10 minutes via restore.
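
One common way to stay under a per-account send rate is a token bucket per paired account. The sketch below is a generic illustration in Python, not part of OpenClawMU. The class names are hypothetical, and the rate and burst values you pass in are placeholders you would tune yourself, not WhatsApp's actual limits.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows short bursts while capping the sustained send rate."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def try_send(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class AccountLimiter:
    """Keeps one independent bucket per paired WhatsApp account."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._args = (rate_per_sec, burst, clock)
        self._buckets: Dict[str, TokenBucket] = {}

    def try_send(self, account_id: str) -> bool:
        bucket = self._buckets.get(account_id)
        if bucket is None:
            bucket = self._buckets[account_id] = TokenBucket(*self._args)
        return bucket.try_send()
```

When `try_send` returns `False`, the caller would queue the reply and retry later rather than drop it; separate buckets keep one busy tenant from delaying another.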

When not to self-host

Very high volume. Above a few thousand messages/day per account, the official WhatsApp Cloud API or a BSP becomes operationally cleaner.
Regulated industries with strict approval flows. Healthcare, banking, and some government contexts require the official API (button-style templates, opt-in flows).
You don’t want to operate a VM. Run the gateway via a managed hosting partner instead. (Hosted-ops contracts available — see /pricing.)

The stack, end-to-end

VM: Hetzner CCX13 ($35/mo) or AWS t3.medium ($30/mo).
OpenClawMU: Apache-2.0, self-hosted.
LLM: Anthropic, OpenAI, or local Llama / Mistral.
TLS / public URL: Tailscale Funnel (free), Cloudflare Tunnel, or your own nginx.
Backups: S3, R2, or MinIO.
Monitoring: any Prometheus scraper for /metrics; any log forwarder for the audit log.

Total fixed cost: $50–80/month depending on VM choice. Variable cost: your LLM bill, which you can pass through to your customers with margin.

That’s the entire playbook. The platform is free; the LLM you pay for; the customers you charge.

Frequently asked

Is using Baileys for WhatsApp legal?

Baileys implements the multi-device WhatsApp protocol, the same protocol the official WhatsApp Web client uses. It's not officially supported by Meta, but it's not against the WhatsApp Terms in personal/SMB use. For high-volume commercial use at scale, switch to the official Cloud API or a BSP partner. Baileys is the fast path for development and small-to-medium production.

How many WhatsApp accounts can one OpenClawMU gateway handle?

One Baileys connection per WhatsApp account, and one account per tenant. Hundreds of paired accounts on a single gateway is feasible; the bottleneck is more about WhatsApp's own per-account rate limits than the gateway. Empirically, a 4-vCPU VM comfortably runs 200+ paired accounts.

Can I migrate from Twilio / Vonage / Meta Cloud API?

Yes. Write a custom channel adapter (~200 lines) that translates between the upstream API and OpenClawMU's normalized envelope format. Sessions, memory, and tools are unchanged. The channel is the only swap.
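
As a rough illustration of what the core of such an adapter might look like for Twilio, the Python sketch below maps an inbound Twilio WhatsApp webhook form to the envelope format shown earlier, and a reply back to Twilio message parameters. The function names are hypothetical and the adapter interface OpenClawMU expects is not shown in this post, so treat this as a sketch of the translation only. It handles text messages and ignores media.

```python
from typing import Dict

_PREFIX = "whatsapp:"  # Twilio prefixes WhatsApp addresses with this scheme


def twilio_to_envelope(tenant: str, form: Dict[str, str],
                       received_at: str) -> dict:
    """Translate an inbound Twilio webhook form into a normalized envelope."""
    sender = form["From"]
    if not sender.startswith(_PREFIX):
        raise ValueError(f"not a WhatsApp sender: {sender!r}")
    user_id = "wa:" + sender[len(_PREFIX):]
    return {
        "tenant": tenant,
        "channel": "whatsapp",
        "user": {"id": user_id,
                 "display_name": form.get("ProfileName", "")},
        # Assumption: same "<user id>:default" session naming as the example.
        "session_id": f"{user_id}:default",
        "content": {"type": "text", "text": form.get("Body", "")},
        "received_at": received_at,
    }


def envelope_to_twilio(envelope: dict, reply_text: str,
                       sender: str) -> Dict[str, str]:
    """Build the parameters for an outbound Twilio message to the same user."""
    phone = envelope["user"]["id"][len("wa:"):]
    return {"To": _PREFIX + phone, "From": sender, "Body": reply_text}
```

Because user and session ids keep the same `wa:` form on both channels, existing sessions and memory line up after the swap.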