February 5, 2026
Andrew Davis

ClawdBot Fought My Security Model

I set up OpenClaw in a sandboxed cloud environment so it couldn't touch anything that mattered. It spent the next 24 hours trying to talk me out of that decision.

OpenClaw is what happens when “AI assistant” stops being a metaphor

I spun up OpenClaw in the cloud over the last 24 hours, on purpose.

Not because I like paying for extra infrastructure, but because I didn’t want an autonomous agent anywhere near my actual laptop, my day-to-day browser profile, or the account access and API keys that would ruin my week if they leaked. I wanted separation: a disposable environment, segmented networking, clean credentials, and a clear “pull the plug” option if anything started behaving strangely.
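That separation can be sketched as a throwaway container launch. This is an illustrative helper, not OpenClaw's documented deployment: the image name and resource limits are placeholders I've made up, and the flags are standard Docker CLI options.

```python
def sandbox_cmd(image="openclaw-test", workdir="/tmp/claw-scratch"):
    """Compose a 'docker run' invocation for a disposable, locked-down sandbox.

    Nothing here is OpenClaw-specific: these are ordinary Docker flags for a
    container you expect to throw away ('pull the plug' == docker kill).
    """
    return [
        "docker", "run",
        "--rm",                          # disposable: no state survives exit
        "--network", "none",             # no connectivity until you deliberately open it
        "--read-only",                   # immutable root filesystem
        "--memory", "2g",                # cap resource use
        "--cap-drop", "ALL",             # drop all Linux capabilities
        "-v", f"{workdir}:/scratch:rw",  # one writable scratch mount, nothing else
        image,
    ]

print(" ".join(sandbox_cmd()))
```

In practice you would then punch deliberate, narrow holes in that wall (a single egress rule to your LLM provider, one mounted credentials file), rather than starting open and trying to close things later.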

And here’s the uncomfortable lesson: OpenClaw doesn’t simply make unsafe setups possible; it keeps steering you back toward them, because the fastest way to make things “work” is usually the loosest way to wire them up.

That’s the part the hype clips skip.

OpenClaw is part AI assistant, part autonomous agent framework: a system that lets software execute tasks with minimal prompting once configured. It’s open source, aimed squarely at developers and technically fluent users, and it’s routinely discussed in the same breath as other agent frameworks because it’s not trying to be a polite chat window; it’s trying to be an operator.

If you’re excited to try it, you should be. It’s one of the clearest demonstrations of where agent systems are heading. But you should also treat it like a live electrical panel, not a weekend app install.

What OpenClaw actually does, in plain terms

OpenClaw is a self-hosted agent that can be run on your hardware or on private infrastructure you control. It connects to external LLM providers (OpenAI, Anthropic, Google, and others) via plugin layers, then uses “skills” and other tools to do real work: run scripts, manipulate files, check your email, browse the web, and interact with messaging platforms like WhatsApp, Telegram, Discord, Slack, and iMessage.

It also keeps memory and context across sessions, which is great when you want continuity and awful when you don’t fully understand what it’s retaining or replaying.

The project even publishes built-in commands for cost visibility, /status and /usage, because the expected model is “bring your own keys, bring your own bill”.

That's the theory, anyway

In practice, a lot of people learn the hard way that “open source” and “free” are not the same as “cheap”, and “self-hosted” is not the same as “safe”.

The rebrand was a warning sign, and it arrived early

OpenClaw has already lived through a naming saga: ClawdBot → MoltBot → OpenClaw, after Anthropic raised legal concerns about the earlier name being too close to Claude (which also tells you how quickly this space is moving, and how much of it is being assembled in public). That turbulence became a magnet for confusion, squatting, and opportunistic scams; exactly the kind of environment where even technically capable people still get tricked because they’re moving too fast.

Matteo Bisi’s security write-up frames the rapid renames as fertile ground for confusion attacks and trust failures, especially once third-party extensions and exposed instances are in the mix.

(And yes, it’s annoying that we even have to talk about name churn in a security discussion, but here we are.)

Why I chose the cloud, and why it still got messy

Running OpenClaw defensively in the cloud was meant to reduce the chance of accidental self-compromise. It did reduce one risk, local machine exposure, while creating a different one: the platform’s setup path makes secure choices costly, brittle, and surprisingly time-hungry.

Here’s what I saw, repeatedly

You try to lock things down (scoped credentials, restricted services, segmented networking), and the “happy path” breaks. Skills fail, onboarding nudges you to loosen constraints, retries kick in, and the agent starts suggesting simpler options that just happen to involve broader permissions and more open connectivity. The system is optimised to complete tasks; when it hits friction, permissive configurations look like progress.

That’s not malice. It’s optimisation. But it creates a nasty feedback loop: the agent “learns” that the easiest way forward is the least safe one, and it keeps offering that answer with a straight face.

OpenClaw’s own docs acknowledge that non-loopback binds require authentication tokens and that remote access patterns matter, which is helpful. Still, it doesn’t change the lived reality that secure deployment feels like swimming upstream. 
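The docs’ loopback point generalises into a check you can run before exposing anything. A minimal sketch of the principle (the host strings below are examples, not OpenClaw configuration keys):

```python
import ipaddress

def bind_is_loopback(host: str) -> bool:
    """True if the bind address keeps the service local-only.

    Anything else (0.0.0.0, a LAN IP, a public IP) means the agent's control
    surface is reachable off-box and needs authentication in front of it.
    """
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # hostnames: resolve them first, or treat as unsafe

print(bind_is_loopback("127.0.0.1"), bind_is_loopback("0.0.0.0"))
```

A guard like this, run at startup, fails closed: the deployment refuses to listen on a non-loopback address unless you’ve explicitly configured auth, instead of quietly becoming another exposed instance.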

The money problem is bigger than most people expect

I spent just over US$80 during setup and light testing, across infrastructure, integrations, and usage, before I got reliable “value” out of the system. That number is not a badge of honour; it’s a warning about how quickly costs show up when an agent is chattering, retrying, and dragging context forward.

And my experience lines up uncomfortably well with what’s already on record:

  • PierrunoYT reported spending about US$40 “not using it much” in OpenClaw issue #6445.

  • In discussion #1949, variableresults described token usage getting so extreme that trivial automations were effectively unworkable under certain provider constraints, with rate limits showing up even on tiny jobs. (The current thread shows people actively switching providers to cope, which tells you something about the default experience.)

  • beastoin, in issue #1594, traced a cost spike to a classic agent failure mode: a huge tool output (they call out a massive JSON schema dump) gets pulled into the main transcript, and suddenly every “small” follow-up forces the model to haul that baggage again, paying for it repeatedly.

  • There’s even a fresh issue proposing a systemic fix for unbounded session bloat and token cost blow-outs, which is about as close as you get to the community admitting “this is not a user error, this is a platform behaviour.”
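The transcript-baggage failure mode in beastoin’s report is easy to put numbers on. This back-of-envelope model uses made-up figures (a 50k-token dump, 20 follow-ups of 500 tokens each, US$3 per million input tokens); the shape of the result, not the exact dollars, is the point:

```python
PRICE_PER_M_INPUT = 3.00   # assumed provider price, USD per 1M input tokens
DUMP = 50_000              # one large tool output (e.g. a schema dump)
TURN = 500                 # tokens added by each "small" follow-up
N = 20                     # follow-up turns in the same session

def cost(tokens: int) -> float:
    return tokens * PRICE_PER_M_INPUT / 1_000_000

# Dump stays in the transcript: every turn re-sends it as input context,
# on top of the steadily accumulating follow-up history.
bloated = cost(sum(DUMP + TURN * t for t in range(1, N + 1)))

# Dump isolated in its own session: turns only carry their own history.
isolated = cost(sum(TURN * t for t in range(1, N + 1)))

print(f"bloated: ${bloated:.2f}  isolated: ${isolated:.2f}")
```

Under these assumptions the dump-in-transcript session costs roughly ten times the isolated one, and the gap widens with every additional follow-up, because the baggage is re-billed on each turn.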

This is why cost visibility features exist (/status, /usage). It’s also why they’re insufficient as a safety mechanism: you often discover the problem only after you’ve paid for it.

The scariest stories aren’t about hackers

The most unnerving incident reports in this ecosystem don’t require an attacker. They only require a mis-scoped permission and an agent that is “helpful” in the wrong direction.

One example that made the rounds is blunt: a user connected ClawdBot to iMessage, it started replying to messages without being asked, and they were grateful it had only replied to their wife before they noticed.

If you’ve ever had that moment of cold embarrassment where you send a message to the wrong chat, scale that feeling up: now it’s an autonomous agent replying across threads while you’re asleep, and you’re left trying to work out which words were yours and which weren’t.

That kind of harm isn’t theoretical. It’s social, reputational, and deeply human.

Why people go quiet after they get burned

Here’s the part Amy Edmondson helped me put words to.

In psychological safety research, one of the recurring patterns is that systems which punish caution also end up punishing honesty, because people learn very quickly that admitting a near miss makes them look incompetent rather than prudent. I’ve seen that same dynamic in security work for years: competent people clean up quietly, rotate keys, tear things down, and tell almost nobody, partly because the story sounds like a personal mistake instead of a predictable outcome of how the system is set up.

Edmondson’s language doesn’t “prove” the OpenClaw risk. It explains why we don’t get clean statistics or neat breach reports, even when the underlying behaviours are being documented in public threads and issues.

(Also: nobody wants to be the person who says, “I set up an agent and it started chatting to our clients”, even if the real failure was structural.)

So what should you do if you’re excited to try OpenClaw?

If you’re going to experiment, treat it like an early-stage agent framework, not a personal assistant.

Start with actions that are boring but protective:

  1. Run it in a disposable environment (cloud VM, isolated project, separate accounts) and assume you will throw it away.
  2. Use narrow credentials with short lifetimes, and plan dedicated keys and expiry as part of the experiment, not as a later tidy-up.
  3. Turn on usage visibility immediately (/status, /usage) and watch for context growth before you bolt on cron jobs or messaging integrations.
  4. Avoid pulling large outputs into the main chat; isolate diagnostics and schema dumps in separate sessions, because session bloat is a known cost driver. 
  5. Keep your first use case tiny, the way the people actually getting value do: one narrow automation, limited scope, no “connect everything” moment.
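Point 4 can be enforced mechanically rather than by discipline. A hypothetical guard, not an OpenClaw feature: the 4-characters-per-token estimate is a rough heuristic, and “offload” here just means writing to a temp file and passing a pointer instead.

```python
import tempfile

MAX_TOKENS = 2_000  # assumed budget for any single tool output in the transcript

def est_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 chars per token for English text

def gate_tool_output(text: str) -> str:
    """Keep small outputs inline; park large ones on disk, return a pointer.

    The transcript then carries a short reference instead of re-paying for
    the full dump on every subsequent turn.
    """
    if est_tokens(text) <= MAX_TOKENS:
        return text
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(text)
        return f"[output of ~{est_tokens(text)} tokens saved to {f.name}]"
```

Anything that sits between tool results and the model’s context window is a good place for a gate like this; the agent can always be asked to open the file deliberately if it genuinely needs the full dump.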

The people who seem to enjoy OpenClaw most aren’t treating it like a friendly assistant; they’re treating it like a script runner with a clever interface, and they’re disciplined about where it’s allowed to touch. Kristian Freeman’s “How I Use OpenClaw” is a useful example of what it looks like when someone approaches it as a controlled command console rather than a general-purpose digital employee.

Where a services partner fits (and why this is not just “user error”)

If you’re evaluating agentic AI for a team, in any industry, this isn’t a question of whether your people are smart. The people in the issue threads and HN posts are smart. The problem is that the system rewards speed, and the costs (financial and security) show up later.

This is where we help:

  • Assessing agent tools before they touch real data, from a human risk lens

  • Setting up safe experimentation environments with constrained permissions

  • Designing secure defaults that don’t collapse the moment something fails

  • Mapping real workflows to sensible automation boundaries, so “helpful” doesn’t become “uncontrolled”

If you want a practical second opinion before your team wires OpenClaw or other automation tools into messaging, cloud accounts, or internal systems, that’s exactly the kind of work we do.

Because the most credible warning I can give you isn’t “be careful”. It’s simpler, and it’s based on what I’ve just watched happen to others and what I’ve watched the system try to coax me into doing.

If you connect an agent like this to something you’d hate seeing on the front page, it will quickly test that fear for you.

Tags

AI, Automation, Industry News