What It Actually Costs to Run a Personal AI Agent Team
Everyone talks about AI agents. Nobody talks about the bill.
I run a team of 5 AI agents on a single Linux machine. They handle my daily briefings, draft tweets, scout for startup ideas, manage coding tasks, and monitor my infrastructure. They've been running for 54 days. I pulled the actual cost data from every session transcript.
Here's what it costs.
The Team
| Agent | Role | Model | Sessions | Cost |
|---|---|---|---|---|
| Forge | PM — manages coding agents, Linear tickets, PRs | Opus | 37 | $540.69 |
| Chill | Personal assistant — briefings, drafts, research | Opus | 50 | $55.87 |
| Radar | Market scanner — Reddit, HN, Play Store | Opus | 7 | $37.47 |
| Spark | Marketing strategy — validation, distribution | Opus | 12 | $24.31 |
| Sentinel | Daily improvement audits, security checks | Sonnet | 8 | $19.72 |
| 114 | $678.06 |
That's $12.56/day or roughly $377/month.
The Elephant in the Room: Forge
Forge is 79.7% of the total bill. One agent, $540.
Why? Forge runs Claude Opus and manages other coding agents — dispatching tasks, reviewing PRs, iterating on code. Every time it orchestrates a sub-agent, that's a new context window. 3,295 API calls over 37 sessions. It's doing the job of a junior PM who also reviews code, and it runs hot.
The other 4 agents combined cost $137.37. That's $2.54/day for a personal assistant, a market scanner, a marketing strategist, and a security auditor. Running 24/7.
Why Caching Changes Everything
Here's the number that matters most: cache hit rate.
| Agent | Cache Read | Cache Write | Cache Ratio |
|---|---|---|---|
| Chill | 207M tokens | 4.3M tokens | 91.3% read |
| Forge | 207M tokens | 67M tokens | 75.6% read |
| Radar | 29M tokens | 3.3M tokens | 89.6% read |
| Spark | 11M tokens | 2.7M tokens | 80.7% read |
| Sentinel | 15M tokens | 3.8M tokens | 80.2% read |
Anthropic charges $15/M tokens for Opus input. But cached input? $1.875/M — 87.5% cheaper.
My setup has long system prompts (workspace files, personality, tools documentation) that get cached across turns. Without caching, Chill alone would cost ~$3,114 for those 207M cached read tokens instead of ~$389. Caching saved roughly $2,725 on one agent.
Across all agents, over 309M tokens were cache reads. At full price that's $4,641. At cached price it's $580. Cache saved approximately $4,061 over 54 days.
What $12.56/Day Actually Gets You
Every day, automatically:
- ☀️ 8 AM: Weather, calendar, inbox triage, open tasks
- 🔍 9 AM: A new tweet draft, written from real analytics data
- 🔍 10 AM: Startup idea scout scanning Reddit, HN, and niche subreddits
- 💬 All day: Personal assistant on Telegram (me, asking random questions)
- 📊 9 PM: X/Twitter analytics review
- 🧠 10 PM: Blog topic suggestion based on recent work
- Plus: PR management, code review, security audits, market research on demand
That's a 5-person team running 24/7 for less than a Netflix family plan with an extra Disney+ subscription.
Would I Recommend This?
At $377/month — it depends.
If you're building something and can leverage the agents for actual work (like Forge managing PRs or Radar scanning markets), it pays for itself in time saved. Forge's $540 over 54 days managed dozens of coding tasks that would have cost me hours each.
If you just want a personal assistant, you could run Chill alone on Sonnet for ~$3/day. The Opus model is overkill for daily briefings and tweet drafts.
The biggest optimization available: move Forge to Sonnet. At Opus, Forge costs $10/day. On Sonnet, it would be roughly $1.50-2/day. That alone would drop the monthly bill from $377 to under $100.
The Numbers Don't Lie, But They Don't Tell Everything
What's missing from this analysis:
- Anthropic API credits vs actual billing — I'm on an API plan, so these are real costs against my balance
- Compute costs — The machine itself (a modest Linux box) is already paid for. No cloud compute fees.
- Opportunity cost — The ideas Forge managed, the posts Chill drafted, the markets Radar scanned. Hard to put a dollar value on "I didn't have to do it myself."
54 days in, I've spent $678 on a team that doesn't sleep, doesn't forget, and doesn't need meetings. Whether that's expensive depends entirely on what you do with it.
All numbers pulled from OpenClaw session transcripts. The extraction script reads every API call's usage metadata, multiplies by published Anthropic pricing, and groups by agent. No estimates — actual token counts and costs.