Back to the Lab
/ BUILD·FILED 18 MAY 2026·12 MIN·BUILD LOG
/ BUILD LOG  ·  PINNED

What a 2026 Website Actually Looks Like

The Lab is Claude-designed, Claude-coded, and fed by a swarm of routine agents wired into every public endpoint of the company. Here's what's under the hood — and what it costs to run.

What a 2026 Website Actually Looks Like
/ TL;DR

The Lab is Claude-designed, Claude-coded, and fed by a swarm of routine agents wired into every public endpoint of the company. Here's what's under the hood — and what it costs to run.

IThe Lab is Claude-designed, Claude-coded, and fed by a swarm of routine agents wired into every endpoint of my company. Here's what's under the hood.

The page you're reading right now wasn't typed.

I didn't open a doc on a Sunday afternoon and bang out 2,000 words. I didn't outline it in Notion and hand it to a writer. I didn't even prompt Claude in a chat window and copy the output back. A routine agent — running on a cron, reading from a Notion database, reading my own posts from the past seven days, reading the watchlist of 50 operators I track — wrote it. Another routine reviewed it. I flipped a Status field from pending-review to published. The site picked it up in under a second.

That's what a website looks like in 2026. Not the markup. Not the design. The fact that the entire content surface — essays, build notes, weekly digests, observations — is being produced by a swarm of routine agents that are connected to public-facing endpoints of my company. Slack, GitHub, X, LinkedIn, RSS, Notion. The agents read those, synthesise, and write structured rows into the system of record. I review. The site ships.

I run Black Matter VC — an AI systems studio. Solo practice. I also still run Junglebee, the SaaS booking platform I founded in 2014. No content team. No marketing person. No engineer on staff. This whole thing — the studio, the published voice, the build essays — is operated by one person and a lot of routines.

Here's how it actually works.


IIThe setup

The Lab is the public-facing layer. blackmatter.vc/lab. It's a Next.js site, hosted on Vercel, designed in conversation with Claude (the design system, the typography, the layout — Claude did the first pass, I edited, Claude shipped the implementation), and built by Claude Code. There is no other engineer. There was no design agency. The repo lives in GitHub. Branch protection on main. Every deploy is a git push.

The site reads from one Notion database — Lab Entries — at the edge. Title, Excerpt, Read time, Tags, Status, Body. The site filters for Status = published and revalidates within roughly one second of a flip. That's the whole frontend.

The interesting part is upstream. Lab Entries doesn't get written by me. It gets written by routines.

A routine, for the purpose of this essay, is a Claude Code agent running on a cron schedule inside Anthropic's managed routines runtime. It reads a prompt from a markdown file in the repo. It pulls data from one or more public-facing company endpoints via MCP — Notion, GitHub, X, LinkedIn, Slack, RSS, the file store, whatever. It writes structured output back into Notion. It logs the run. It exits.

Each routine does exactly one thing on one schedule. That's the constraint that makes the whole system debuggable.


IIIThe endpoints (this is the part most people miss)

Before any routine writes a single word, the company has to be wired up. The reason this whole thing works is that every system I use to produce public output is now agent-legible.

Here's what's connected:

My own social activity. A routine pulls my posts from X and LinkedIn every morning. It also refreshes engagement metrics on prior posts so I know what landed and what didn't. The posts that hit 3× baseline engagement get flagged as essay candidates.

The watchlist. Roughly 50 operators I track on X and a smaller LinkedIn list — fund GPs, AI builders, agent-infra voices. A routine reads them daily and writes the substantive ones into a Social Posts database. It also drafts up to two quote-post responses on high-relevance items and queues them for me to approve. A daily intel briefing lands in a Slack channel at noon.

The repos. A routine watches every public Black Matter repo for new releases. When something ships, it drafts a release note — one sentence on what shipped, one paragraph on why it matters, the PR link. Strictly public repos. The privacy model has to be airtight — a leaked private repo name on the public site is a relationship-ending event.

The Slack channels. A routine watches a couple of internal channels that I use as a public-ship log — when a Black Matter deployment goes live, when a new open-source MCP server I've built lands in a public registry, when a post of mine gets quoted somewhere I want to remember. It drafts release-style or observation-style entries with the Slack permalink as the source. The agent is scoped to those specific channels by design; it cannot see anything else in the workspace.

RSS. About 40 sources across AI infra, fund tooling, and agent thinking. A routine reads them daily, filters for relevance to what Black Matter actually does, and drafts observation entries that tie the piece back to a position the studio holds.

The published-work corpus. A routine reads everything that's already shipped on the Lab — every essay, every observation, every release note — and maintains a small vector index of "positions the studio has publicly taken". The weekly synthesiser uses this to make sure new long-form is building on what's already out there, not contradicting it.

That's five live endpoints feeding the system, plus the corpus index that keeps the voice consistent. All of them are public, or scoped to public-facing channels, by design. The system is wired to amplify what the studio publishes — never to ingest private work.

The endpoints aren't novel individually. Every company has Slack. Every company has GitHub. Every company has an RSS reader somewhere. What's novel is that all of them are now agent-legible at the same time, and you can wire one agent to all of them in a weekend.


IVThe agents — what each one does

There are about a dozen routines running right now. I'll describe the five that actually matter for the content surface.

The watchlist agent. Reads ~50 X profiles and a smaller LinkedIn list every morning. Writes the substantive posts into the Social Posts database, dedupes against the last seven days, drafts a couple of quote-post responses, posts a daily briefing to Slack. The first version of this agent was scraping X profile pages via HTTP. Cloudflare started rate-limiting Anthropic's egress IPs within 48 hours. I built a custom X MCP server in three hours, deployed it to Vercel, and the agent has been at 100% since. Cost: ~$25/month on the X API. Hosting: free.

The repo watcher. Pulls new commits and releases from public BM repos every day. Drafts release-style entries. Constrained to public repos by design.

The ship-log agent. Reads the two designated public-ship Slack channels for moments worth marking — a deployment going live, a new MCP server landing in a registry, an external mention worth quoting. Writes draft observation entries with the source permalink. The channel scope is hard-coded; the agent has no path to anything sensitive even if it tried.

The weekly synthesiser. Runs every Sunday at 06:00 UTC. Reads the trailing seven days of my own social posts plus the watchlist posts plus the trailing month of the Lab corpus. Finds a through-line — a position the week's conversation is converging on that I haven't yet written about. Writes one long-form essay, 1,500–2,500 words, in voice, as a Type: essay Lab Entry. This is the routine that wrote last week's essay on agent harnesses. It runs on the highest-tier Claude model — the quality bar is non-negotiable, and a weak essay is worse than no essay.

The editor. Runs late every day. Reads every entry that hit pending-review in the last 36 hours. Scores each on nine dimensions — voice match, depth, specifics over abstractions, repo privacy, closing line, style match against a corpus of reference essays. Three outcomes: green-check, rewrite in place, or send back to draft with a fix-it comment. The editor is the most important agent in the swarm. It's also the one I rewrote three times before it was good enough to ship.


VThe schema is the API

Most people read this and think the magic is in the prompts. The magic is in the schema.

Lab Entries has roughly 15 columns. Title. Excerpt. Read time. Tags. Type (essay / build / release / observation). Status (draft / pending-review / published). Agent author. Date. Slug. The frontend reads three of them. The agents write to all of them. The schema is what makes the agents legible to each other — the watchlist agent writes Social Posts rows that the weekly synthesiser reads in one query, the ship-log agent writes observation drafts that the editor can score against the same nine dimensions as everything else, the editor writes review scores back into Lab Entries so I can see at a glance which entries an agent already approved.

If the schema is right, the prompts get short. If the schema is wrong, the prompts get long and brittle, and the agents start tripping over each other.

I rewrote the schema once, three days in. The first version put raw social posts, half-finished release stubs, and ready-to-publish essays all in the same Lab Entries table. By day three the table was unusable — I was filtering it five different ways to find what I needed. I split it into three databases: Social Posts (raw social activity), Agent Runs (every run leaves a row), and Lab Entries (publish queue only). The agents that produce raw content don't touch the publish queue. The agents that synthesise read raw and write polished. The publish queue stays clean by design.

The rule I came out of that with: if you ever find yourself filtering a single table by Status to find the publish-ready subset, the table is doing two jobs. Split it.


VIEvery run leaves a row

This is the one architectural decision that turned the swarm from "twelve cron jobs I hope are running" into "a system I can operate".

Every agent writes to one Notion database — Agent Runs — before it exits. Started at, finished at, status (success / partial / failure), entries created, entries modified, notes on what happened. When something breaks, the row is the postmortem. When something runs quietly, the row is the proof.

The cost is one extra Notion write per run. The benefit is the difference between "the cron ran" and "the work got done". They're not the same thing.

When I open my laptop in the morning, the first thing I look at isn't a dashboard. It's the Agent Runs database, filtered to the last 24 hours, sorted by status. Anything red, I dig in. Anything yellow, I read the notes. Anything green, I trust. The whole system is operable from one Notion view.


VIIThe cost

Everything below is real, not hypothetical.

  • Claude Code routines runtime: ~$300/month for twelve agents running daily plus two weekly
  • Notion (the system of record): ~$30/month
  • The custom X MCP: ~$25/month on the X API
  • Vercel hosting for the MCP and the public site: $0/month on Hobby tier
  • GitHub: free

Total: under $400/month. The studio's entire content surface costs less than a single freelance writer would charge for one essay.

That number is the headline, but it's also the boring number. The interesting one is the time. Before the swarm, the Lab shipped maybe two pieces a month — when I had a weekend free, which is rare with two kids under four and a studio to run. After the swarm, something publish-ready hits pending-review every weekday. I'm the editor. I'm the operator. I'm not also the production line.


VIIIWhat broke and how it got fixed

Three things broke in the first week. Each one taught me something about the architecture.

The schema dilution problem. Day three. Lab Entries had become unsearchable — raw imports next to drafts next to publish-ready essays. The fix was the schema split described above. Lesson: one table per output type, not one table for everything.

The X scraping cliff. Day four. The watchlist agent was reading X profile pages via HTTP. Cloudflare started rate-limiting Anthropic's egress IPs. The agent worked maybe 70% of the time, which is worse than not working at all — you can't make decisions on a 70%-reliable signal. I built a custom X MCP that afternoon. Three hours of TypeScript. Six hours of fighting OAuth. Now it's 100%. Lesson: when the platform fights you, build your own connector. It's faster than it sounds.

The MCP scope discovery problem. Day five. The GitHub MCP I'd wired up turned out to be scoped to public repos only. The repo watcher kept returning 404s on private repo URLs, and the first runs almost shipped broken entries. The fix was a startup probe — every MCP-using agent now logs which scopes it can see at the start of the run and skips cleanly when access is missing. Lesson: scope discovery should be the first thing every agent does, not an assumption. (And in this specific case, public-only was the right scope anyway — the repo watcher should never see private code.)


IXThe honest verdict

The Lab is not magic. It's not set-and-forget. There are weeks when the editor agent flags four out of five entries for rewrite. There was a Saturday last month when the weekly synthesiser wrote a piece I genuinely disliked and I trashed it. The ship-log agent once tried to write an observation that quoted a Slack message I hadn't intended to make public — caught by the editor's privacy check on the way through, but not by the writer's own logic. I'm still tuning that agent.

The first version of the whole swarm was, in Michael-prompt language, super shit. I built a fat custom orchestrator, beautiful state machines, the whole thing. A month in I was sitting there thinking — what the hell did I just build? I rewrote it on a flight. The version running today is one-third of the code and ten times more useful, because the leverage is in the schema and the prompts, not in the runtime.

What's working: the cadence. Every weekday something hits pending-review. The weekly synthesiser is producing essays I would have written myself if I had three uninterrupted hours, which I never do. The watchlist agent is the single highest-signal input in my week — I open the noon briefing and I know what happened in agent-infra Twitter before I've looked at the timeline myself.

What's still hard: the editor. Nine scoring dimensions is the right number, but two of them (voice match and "specifics over abstractions") are fuzzy enough that I'm still re-tuning the rubric every month or so.

What's surprising: the audit trail has become a product feature, not just an ops feature. People I show the system to ask how I know an agent did its job. I show them the Agent Runs view. That trust is the difference between selling automation and selling AI infrastructure.


XThe privacy boundary, on purpose

One thing worth saying explicitly. Every endpoint feeding this swarm is either public (X, LinkedIn, public GitHub repos, RSS) or scoped to channels I have explicitly designated as public-ship surfaces (two named Slack channels). No agent in this system reads private repos, private DMs, or any conversation I have with another person. That is a design decision, not a limitation — the system is wired to amplify what the studio publishes, not to extract from what the studio does behind closed doors.

A different system, with different consent and different scopes, could do other things. This one doesn't. That boundary is part of what makes the architecture safe to operate as a solo practice.


XIWhat this looks like for a company that's not a one-person studio

The shape transfers, almost one-to-one.

Picture a company with a marketing team, a product team, a sales team, and an engineering team. The endpoints are the same shape: the company blog's RSS, the team's LinkedIn presence, public GitHub releases, designated public-ship Slack channels, the public corpus of everything the company has already published. The schema is different — Product Updates instead of Lab Entries, Competitor Signal instead of Social Posts, but the same row-per-run audit log. The agents do the same kinds of jobs: an agent that watches the 30 companies in your category and flags material updates, an agent that drafts release notes from public commits, an agent that synthesises the trailing week into one Monday-morning brief for the leadership channel.

Same architecture. Same cost surface (under $500/month). Same operational property: one person can run it, every agent leaves a row, every prompt lives in Git, every schema is portable.

Most teams I talk to don't have a content-quantity problem. They have a content-coherence problem. The blog says one thing, the LinkedIn posts say another, the founder's X says a third, the changelog is a year out of date. A swarm that reads everything the company has already published and writes new long-form building on that corpus — not contradicting it — is the difference between a company with a voice and a company with five disconnected voices.

That's the structural change in what one operator can do that wasn't possible 18 months ago. That's not a productivity improvement. That's a structural change in what one person can do.


XIIWhat I'd do differently

I'd build the schema before any of the agents. I tried to build them in parallel and it cost me three days.

I'd build the editor agent first, not last. The editor is the quality gate. Without it, every other agent's drafts pile up in pending-review and the human reviewer (me) becomes the bottleneck. The editor catches the obvious failures and frees the human review for taste-level calls.

I'd version the prompts in Git from day one. The first week I had prompts in Notion pages — easier to iterate on, harder to diff. By the end of the week I'd moved everything to markdown in the repo, where I can see what changed between runs and roll back when a tweak makes things worse.

I'd accept earlier that the routines runtime is rental. Anthropic might deprecate the managed routines product in twelve months. I designed for that — every prompt is portable, every schema is in Notion, every MCP is a standalone Vercel deployment — but I notice I was still emotionally attached to the runner for the first month. Don't be. Treat the harness as the cheapest part of the system. It is.


XIIIMore updates coming

The next thing I'm building is a second instance of this swarm running entirely inside a company's own infrastructure — same shape, different schema, different public endpoints, full audit trail. If you're reading this and the architecture above is doing the work you'd otherwise hire a content engineer or a platform team to do, that's the engagement.

I publish a build essay every Saturday and a weekly digest every Monday at blackmatter.vc/lab. The next build essay will probably be about the editor agent — what's in those nine scoring dimensions, why I rewrote it three times, what it still gets wrong.

If you want this for your company, email me. michael@blackmatter.vc. Flat retainer, three months, build + operate, no lock-in.

More updates coming.

Michael Rouveure  ·  18 MAY 2026

/ WORKING WITH BLACK MATTER VC

If this was useful,
you should book a call.

$10k / month. Whatever your fund needs, shipped that month. 30-min intro, no deck — I’ll tell you which three systems I’d ship first.

Or follow along on LinkedIn / X.