/ ESSAY·FILED 18 MAY 2026·14 MIN·LONG-FORM

Structuring Your Company For AI In 2026

The pillars that don't move, the ones that do, and how to tell them apart. Five owned layers, two rental layers, and the architecture that survives the next eighteen months of model and agent churn.

I. The pillars that don't move, the ones that do, and how to tell them apart

By Michael Rouveure — Black Matter VC

18 May 2026


Most teams I sit down with in 2026 are in the same position. The project queue has thirty active items. Half of them depend on a model that got reshuffled by the last vendor release. A third are blocked on a tool somebody set up eighteen months ago that nobody now wants to admit they don't understand. The AI bill has tripled and nobody can tell you which line item produced which outcome.

The sentence I keep landing on, in some form or another, is this: it's time to stop running experiments and start running a roadmap.

That's the whole problem statement for any company serious about AI in 2026. Most teams (funds, SaaS companies, agencies, professional services firms, small founder-led businesses) got into AI through experiments. A Zapier workflow here. An enterprise model seat there. A no-code app that one analyst built and nobody else uses. An enrichment pipeline that yields five email replies per conference for ten hours of human time. Eighteen months of this and you have a tooling graveyard, no idea what's load-bearing, and an AI bill nobody can attribute to a decision.

The companies doing this well in 2026 aren't the ones running the most experiments. They're the ones who figured out which layers are permanent, which ones are rental, and how to stop confusing the two.

That distinction is the whole essay.


II. What changed since you last looked

Three things changed in the last twelve months that broke most of the AI architectures companies built in 2024 and 2025.

The model layer became disposable. A year ago, signing an enterprise contract with one of the big model vendors was a strategic decision. Today it's a billing decision. Almost every serious operator I talk to is mid-migration from one frontier model to another right now — the team that planted its flag on a single vendor in 2024 spent 2026 re-platforming. The team that didn't, didn't. There's a now-canonical post going around about someone who spent a year building scaffolding for one model's harness and watched a single update obsolete the whole stack. That's the 2026 mistake in one sentence.

The agent layer also became disposable. Six months ago "which agent platform" was a serious question, with a half-dozen open-source frameworks and a handful of proprietary GUIs competing for the orchestrator slot. Today every model vendor also ships the harness, the runner, the orchestrator, and the connector library. The harness you spent a quarter wiring up isn't a moat. It's a polished cage you're locked inside the moment the model vendor ships the same thing natively, which is roughly every four months now.

The data and permissions layer didn't change at all. Your document store is still your document store. Your CRM is still your CRM. Your seven years of customer records, project notes, contracts, and internal memos are still in the same eight places they were in 2023. The sensitivity labels, the delegate-access rules, the "this person has access to the finance folder but the agent shouldn't" problem — none of that got easier. If anything it got harder, because every agent you connect to anything makes the permissions question urgent in a way it never was when it was just humans reading the files.

The companies doing this well internalised the asymmetry. The layers above the data — model, harness, framework, UI — are rental. The layers at and below the data — context, schema, permissions, skills, verification loops — are owned. Spend accordingly.


III. The seven pillars

There are seven things you have to make a decision about. Five of them barely move year-to-year. Two of them you should expect to swap every twelve to eighteen months. Here is the map.

[Figure 1: map of the seven pillars]

a. Pillar 1: Code, repos, and the work surface around them

Don't overthink this one. The default is GitHub. Private org. Every team I have worked with this year landed there regardless of where they started. The reason isn't tribal — it's that the agentic coding tools that defined 2025 and 2026 all assume GitHub as the substrate. The day you decide a customer-facing dashboard or an internal tool needs three more fields, the path of least resistance is "ask the coding agent to open a PR against the repo". That path only exists if the repo is in GitHub.

GitLab is the credible alternative — if your company is already standardised on it (regulated industries, self-hosted preferences, EU data residency requirements), stay. The agentic tooling around GitLab is six months behind GitHub on average, but the gap is closing. Don't migrate off GitLab just because GitHub has more agent integrations this quarter. Do migrate off Bitbucket, Azure DevOps, or any self-hosted SVN that hasn't been touched since 2019 — those are where the agent-tool gap is permanent.

The work surface around the repo matters almost as much as the repo itself. Linear is the default for issue tracking, sprint planning, and the layer where humans and agents both write tickets. The reason is the same as for GitHub — the agentic tools assume it. Coding agents open Linear tickets, attach PRs to them, and close them. If your team is on Jira, Asana, or ClickUp, the friction shows up in every agent workflow. Linear isn't strictly required (some teams do fine in GitHub Issues directly), but pick one, wire it to the repo, and stop layering project-management tools on top of project-management tools.

Cost: GitHub free to ~$4/user/month for the team plan; GitLab free to ~$29/user/month for premium; Linear ~$8–14/user/month. The thing that's expensive isn't the seats — it's the absence of a repo or a clean ticket queue when you need one. Half the teams I work with already have people who set up their own personal GitHub accounts because the coding tools keep prompting them to. Make it official. Single org, branch protection on main, secrets pulled from a single source.

Why this is durable: GitHub has been the substrate for ten years and the AI-coding tools that landed in 2025–26 doubled down on it rather than building around it. GitLab will continue to coexist for the segments of the market where its model fits. Linear has eaten the issue-tracker category at AI-native companies and shows no sign of slowing. Pick the right one for your context, commit, move on.

b. Pillar 2: Hosting on Vercel (or Railway)

Both are fine. Pick one and stop thinking about it.

Vercel if the dominant workload is web apps, serverless API endpoints, MCP servers, or dashboards: anything that's a Next.js app, a small TypeScript serverless function, or a thin HTTP wrapper around an API. Vercel's pricing makes this almost free at small-company scale.

Railway if the dominant workload is long-running processes, background workers, scheduled jobs that don't fit a short serverless window, or stateful services like a database you want to operate yourself.

In practice most companies end up with both. Vercel for the things users open in a browser, Railway (or a managed routines runtime from your model vendor, or a Mac mini in a cupboard for genuinely sensitive workloads) for the things that run on a cron at 2am.

What you don't want: cloud-console-clicked infrastructure with no version control. The minute a company gets to "we have a service running somewhere but nobody knows the password to the bastion", you're in a hole. Vercel and Railway both deploy from a git push. That's the operating constraint — if the answer to "how do I redeploy this" isn't a git push, the platform is the wrong one.

Cost: ~$20–40/month per project at small-company scale. Order-of-magnitude cheaper than the engineer-time you'd burn operating your own infra.

Why this is durable: serverless-from-git is the dominant pattern. The competition between the major platforms is real, but they all converge on the same shape. You can move between them in a weekend if a price changes. Don't sweat the choice.

c. Pillar 3: Data in Supabase

This is the one most companies get wrong, and the one that pays back the most over the next three years.

Supabase is Postgres-as-a-service with auth, storage, row-level security, vector embeddings, and a real-time API on top. Translation: it's the database, the auth layer, the file store, the vector DB for semantic search, and the API — one product, one bill, one schema.

Why this matters: every AI thing you'll want to build is a query against your own data. An account brief queries the CRM and the document store. A customer digest queries the support history and the product analytics. A diligence or RFP chatbot does vector search across the data room. A support agent reads the ticket history. Every single one of those reduces to "give a language model controlled access to a Postgres database with some embeddings".
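To make "a Postgres database with some embeddings" a little more concrete, here is a minimal in-memory sketch of what the vector-search step does: rank stored chunks by cosine similarity to a query embedding and keep the top few. The `Chunk` shape, the toy two-dimensional embeddings, and the function names are assumptions for illustration only. In a real setup the ranking would normally run inside the database, not in application code.

```typescript
// Hypothetical shape for a stored chunk of the corpus.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  // A zero vector has no direction, so treat it as matching nothing.
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], corpus: Chunk[], k: number): Chunk[] {
  return corpus
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy corpus: real embeddings have hundreds or thousands of dimensions.
const corpus: Chunk[] = [
  { id: "contract-1", text: "Renewal terms for the account", embedding: [0.9, 0.1] },
  { id: "memo-7", text: "Office move logistics", embedding: [0.1, 0.9] },
];
const best = topK([1, 0], corpus, 1);
```

The chunks returned by `topK` are what gets handed to the model as context. Everything else in the brief, digest, or chatbot is prompt and permissions.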

You don't want eight different stores for eight different use cases — that's the architecture that produces the enrichment pipelines we routinely rip out: a staging tool, a data-enrichment tool, the CRM itself, a separate enrichment-of-the-enrichment, plus a manual dedupe step that breaks every export. Each tool in isolation is fine. Eight of them is a tooling graveyard.

Supabase consolidates that into one schema with row-level security. The agent connects to one place. The permissions are at the row level, not at the tool level. When (not if) the agent layer changes next year, the data layer doesn't.

Cost: $25/month Pro tier covers most company use cases for a long time.

Why this is durable: Postgres is forty years old and shows no sign of going anywhere. The thin wrapper Supabase provides is replaceable in a week if it ever needs to be. The schema you design is what's actually valuable, and the schema is portable to vanilla Postgres on day one.

A note on what doesn't go here: anything your existing system of record already owns. Your customers and deals live in your CRM. Your internal notes live in your notes tool. Your documents live in your document store. Supabase is for the data your company produces that doesn't have an obvious home — derived signals, agent run logs, vector embeddings of your corpus, application state for the dashboards you build. Don't migrate the CRM into Supabase. Connect them.

d. Pillar 4: The AI model layer is rental

Here is the rule that holds across every stack I see: assume the model vendor will change inside twelve months and architect for it.

In practice this means three things.

One contract, one default, plus a fallback. As of mid-2026 the default at most companies is whichever frontier model is leading on coding and reasoning that quarter. It wasn't a year ago. It might not be in twelve months. The adoption curves are wild — teams go from full-vendor-A to half-vendor-A-half-vendor-B to full-vendor-B inside two quarters, with no central decision behind it, just people opening the tool that works better that month. The contract isn't the strategic decision — the contract follows the usage. Update billing every quarter.

Don't bake the model into the application. Every API call goes through one wrapper that knows how to talk to multiple vendors. Same prompt, different vendor, different model. Three lines of config. If you can't change the vendor with a config flip, your stack is wrong.
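One way to read "one wrapper, three lines of config" is sketched below. The vendor ids, model name, and request shapes are all hypothetical and do not correspond to any real vendor's API. The point is only that application code calls `buildRequest` and never sees a vendor-specific field.

```typescript
// Hypothetical vendor ids; replace with whichever vendors you actually use.
type VendorId = "vendorA" | "vendorB";

interface ModelConfig {
  vendor: VendorId;
  model: string;
  maxTokens: number;
}

// Each adapter turns a vendor-neutral prompt into that vendor's request body.
// Sending the request is omitted; these shapes are invented for illustration.
type Adapter = (prompt: string, cfg: ModelConfig) => Record<string, unknown>;

const adapters: Record<VendorId, Adapter> = {
  vendorA: (prompt, cfg) => ({
    model: cfg.model,
    max_tokens: cfg.maxTokens,
    messages: [{ role: "user", content: prompt }],
  }),
  vendorB: (prompt, cfg) => ({
    modelName: cfg.model,
    outputLimit: cfg.maxTokens,
    input: prompt,
  }),
};

// The only function application code is allowed to call.
function buildRequest(prompt: string, cfg: ModelConfig): Record<string, unknown> {
  return adapters[cfg.vendor](prompt, cfg);
}

// The three lines of config: change these to change vendor.
const activeConfig: ModelConfig = {
  vendor: "vendorA",
  model: "frontier-model-x",
  maxTokens: 1024,
};

const request = buildRequest("Summarise this account.", activeConfig);
```

Swapping vendors means editing `activeConfig` and, at most, adding one adapter. If a change of vendor touches anything outside this file, the model has leaked into the application.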

Skills and prompts are version-controlled in GitHub, not in the vendor's console. This is the single biggest mistake I see. Teams build custom instructions inside the vendor's UI, then six months later they're locked in by the operational cost of recreating it elsewhere. Every prompt, every skill, every persona belongs in a markdown file in a Git repo. When you change vendors, you copy markdown.

The pattern that keeps reappearing as the right shape for a company's internal AI surface is this: one enterprise account with your chosen model vendor for the people layer (everyone gets a seat, they use it as their default LLM, they install team-wide skills), plus a thin custom application for the agent layer (one or two serverless MCP servers that wrap your sensitive data sources with the correct permissions, called by the model). If the vendor ships a native connector that handles the permissions correctly — which they will, eventually — the wrapper goes away in a weekend. If they don't, or if the permissions semantics are wrong (and right now, for most enterprise document stores, they are wrong — full delegate access leaks across scopes), the custom layer stays. The optionality is in the architecture, not in the contract.

Cost: team plans are ~$30/user/month. Enterprise is meaningfully more, and worth it specifically for SAML SSO and tighter document-store permissions. The model API spend on top of seats varies wildly with usage — budget $100–500/user/month for active users, plus whatever your agent routines burn (a typical agent setup comes in at $200–500/month of model-routine spend).

Why this is rental: the competitive dynamics are vicious. Every vendor is six months behind the leader and twelve months ahead of where the leader was a year ago. Pick a default, swap freely, don't get sentimental.

e. Pillar 5: The agent / harness layer is also rental

The temptation in 2026 is to pick a "platform" — one of the open-source orchestration frameworks, a workflow automation tool, an in-house agent harness — and treat it as a strategic choice. Don't.

The harness should be the smallest, dumbest possible thing. Read prompt from a markdown file. Pull source data from your system of record. Write back to that same system. Log the run. Exit.
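As a rough sketch of how small that can be, the harness below does exactly those five steps and nothing else. Every name in it (`RoutineDeps`, `RunRow`, `runRoutine`) is hypothetical, and all I/O is injected so that the prompt source, the system of record, and the model call can each be swapped without touching the harness.

```typescript
interface RunRow {
  routine: string;
  startedAt: string;
  finishedAt: string;
  status: "success" | "failure";
  detail: string;
}

// The harness owns no logic of its own: everything it touches is injected.
interface RoutineDeps {
  readPrompt: () => Promise<string>; // e.g. a markdown file in the repo
  fetchSource: () => Promise<string>; // e.g. a query against the system of record
  callModel: (input: string) => Promise<string>;
  writeBack: (output: string) => Promise<void>;
  logRun: (row: RunRow) => Promise<void>;
}

async function runRoutine(name: string, deps: RoutineDeps): Promise<RunRow> {
  const startedAt = new Date().toISOString();
  let status: RunRow["status"] = "success";
  let detail = "";
  try {
    const prompt = await deps.readPrompt();
    const source = await deps.fetchSource();
    const output = await deps.callModel(`${prompt}\n\n${source}`);
    await deps.writeBack(output);
    detail = `wrote ${output.length} characters`;
  } catch (err) {
    status = "failure";
    detail = err instanceof Error ? err.message : String(err);
  }
  const row: RunRow = {
    routine: name,
    startedAt,
    finishedAt: new Date().toISOString(),
    status,
    detail,
  };
  await deps.logRun(row); // the row is written on success and on failure
  return row;
}
```

There is no retry logic, no state machine, and no queue here on purpose. If a routine needs any of those, that is a signal to check whether the runtime already provides it before writing it yourself.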

The agent swarm we run inside our own studio — a dozen routines that produce most of the content surface across our published channels — runs on exactly this. Managed routines runtime from the model vendor. Cron schedules. MCP connectors for the systems of record. Failure alerts to a single Slack channel. That's it. No orchestration framework. No state machines. No multi-agent message bus. If tomorrow the vendor deprecates the routines runtime, ninety percent of the value migrates with a weekend of work because the leverage was never in the runtime — it was in the prompts (in Git), the schema (in Notion), the skills (in markdown), and the audit log (in a Notion DB).

The same principle applies at the company scale. The question that comes up most often is some version of "should we build a [diligence / sales / support / research] platform?" The answer is no. You should build a thin agent — one MCP server that knows how to query your existing system of record with the right permissions, plus a skill that knows what kinds of questions your people actually ask. The platform is the model vendor. The agent is the connector and the skill. Total surface area: a couple hundred lines of TypeScript, one Vercel deployment, one Supabase table for the run log. Operating cost ~$50/month. Replaceable in a week if anything in the stack changes.

The "agent platform" you don't build is the most valuable architectural decision you'll make this year.

f. Pillar 6: Secrets, identity, and permissions

This is the boring one that decides whether you can keep going.

Secrets management: one place. Vercel environment variables for things deployed on Vercel. GitHub Secrets for CI. A password manager (1Password, Bitwarden) for human-held credentials. Don't paste API keys into your notes app. Don't email them. Don't put them in a model vendor's "project instructions" field. The minute secrets sprawl, you have an exfiltration risk that no model-level guardrail will save you from.
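A small sketch of what "one place" looks like from the application side: the service reads every secret from its environment at startup and refuses to start if one is missing. The secret names below are hypothetical. The environment is passed in as an argument so the function is easy to test; in a Node deployment you would pass `process.env`.

```typescript
// Hypothetical secret names; substitute whatever your deployment defines.
const REQUIRED = ["MODEL_API_KEY", "SUPABASE_SERVICE_KEY", "CRM_TOKEN"] as const;
type SecretName = (typeof REQUIRED)[number];
type Secrets = Record<SecretName, string>;

function loadSecrets(env: Record<string, string | undefined>): Secrets {
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report names only; never echo secret values into logs.
    throw new Error(`missing secrets: ${missing.join(", ")}`);
  }
  const out = {} as Secrets;
  for (const name of REQUIRED) out[name] = env[name] as string;
  return out;
}
```

Failing fast at startup is the design choice that matters here. A missing key surfaces at deploy time rather than at 2am when the routine first needs it.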

Identity: one IDP. Google Workspace for most companies under 50 people; Microsoft Entra ID (formerly Azure AD) for most companies above that. SAML SSO for everything that supports it. The reason you upgrade to the enterprise tier of your model vendor over the team tier isn't the model. It's SAML. Same for Vercel, GitHub, Supabase. When someone leaves, you turn off one account and everything turns off. If you can't do that, you're one disgruntled exit away from a data incident.

Permissions: at the data layer, not the tool layer. This is the issue that keeps coming up. Off-the-shelf MCPs for major document stores give the agent delegate access — the agent can read everything the user who connected it can read. For a senior person who has access to half a dozen folders including sensitive material (HR records, financials, customer contracts, board material), that's an immediate problem: the agent inherits scopes it has no business touching. The "fix" at the tool layer is to install separate MCPs per scope, which works until somebody connects the wrong one. The fix at the data layer is row-level / folder-level permissions enforced on the source, with the agent getting only what it's entitled to. Supabase RLS handles this for data you own. For external document stores, the fix is a custom MCP that maps the agent's identity (not the connecting user's) to a specific scope. This is most of the engineering work in any serious agent build.

The general rule: if the agent's access is "whatever the human can see", you have a permissions architecture by accident, not by design. Fix it.
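The core of that custom layer can be sketched in a few lines: access is looked up by the agent's own identity, and anything outside its scope is refused, not silently dropped. The scope names, agent ids, and function names here are hypothetical and do not reflect any document store's real API.

```typescript
type Scope = "sales" | "support" | "finance" | "hr";

interface Doc {
  id: string;
  scope: Scope;
}

// Access is keyed on the agent's identity, never on the connecting user's.
const agentScopes = new Map<string, ReadonlySet<Scope>>([
  ["account-brief-agent", new Set<Scope>(["sales", "support"])],
  ["finance-digest-agent", new Set<Scope>(["finance"])],
]);

interface AccessDecision {
  allowed: Doc[];
  refused: Doc[]; // kept so each refusal can be written to the audit log
}

function filterForAgent(agentId: string, docs: Doc[]): AccessDecision {
  // Unknown agents get an empty scope set, so they are refused everything.
  const scopes = agentScopes.get(agentId) ?? new Set<Scope>();
  const allowed: Doc[] = [];
  const refused: Doc[] = [];
  for (const doc of docs) {
    (scopes.has(doc.scope) ? allowed : refused).push(doc);
  }
  return { allowed, refused };
}

const decision = filterForAgent("account-brief-agent", [
  { id: "deal-notes", scope: "sales" },
  { id: "salary-review", scope: "hr" },
]);
```

In this example the account-brief agent gets the deal notes and is refused the salary review, regardless of what the person who connected it can open. A check like this in application code is a second line of defence. The source system should enforce the same rule.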

g. Pillar 7: Observability and the run log

Every agent leaves a row. Every run. Every time.

This is the single architectural decision that turns "a bunch of cron jobs I hope are running" into "a system I can operate". One database table — call it agent_runs — with one row per execution. Started at, finished at, status (success / partial / failure), what it found, what it created, what it skipped, why. When something breaks, the row is the postmortem. When something runs quietly, the row is the proof.

For most companies this lives in Supabase, alongside the rest of the derived data. Every query the agent answers gets a row. Every sensitive document it reads gets logged. Every refusal (the permissions check that prevents an out-of-scope leak) gets a row. The audit trail is a feature, not an afterthought — it's what makes the system defensible to your board, your customers, your auditor, your regulators, and yourself when something goes wrong.

The cost is one extra write per agent invocation. The benefit is the difference between "the cron ran" and "the work got done", and they are not the same thing.
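A hedged sketch of what a row might hold, plus the check that separates "the cron ran" from "the work got done". The field names follow the description above but are my own choice, and the rule that a run must have created something to count as work is an assumption. Some routines legitimately find nothing, so tune it per routine.

```typescript
type RunStatus = "success" | "partial" | "failure";

// One row per execution in the agent_runs table (column names are illustrative).
interface AgentRun {
  routine: string;
  startedAt: string; // ISO 8601 timestamp
  finishedAt: string;
  status: RunStatus;
  found: number;
  created: number;
  skipped: number;
  why: string; // free-text reason for skips or failure
}

// "The cron ran" is any row at all; "the work got done" needs output too.
function didWork(run: AgentRun): boolean {
  return run.status !== "failure" && run.created > 0;
}

// Rows a human should look at in the morning.
function needsAttention(runs: AgentRun[]): AgentRun[] {
  return runs.filter((run) => !didWork(run));
}
```

A daily query for the rows `needsAttention` would return is usually enough monitoring for a handful of routines.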


IV. How the pillars connect: the worked example

Here's what a company's stack looks like at the end of a 90-day engagement when we do this properly, mapped onto the seven pillars:

Notice the asymmetry. The five owned pillars (repo and ticket workflows, hosting, data and schema, secrets with identity and permissions, and observability) should not change in five years unless something genuinely structural happens. The two rental pillars, model and agent harness, should be expected to change inside twelve months, and the architecture should make that swap a config change, not a rewrite.

That ratio is the whole game. If you find yourself in a stack where the "owned" pillars are also vendor-coupled — your schema lives inside a no-code app, your audit log lives inside the agent vendor's dashboard, your permissions are defined in the model vendor's project console — you have inverted the asymmetry and you will pay for it.

The workflow for a single feature now looks like this. Someone on the team asks "can we get a one-pager on this account by Friday?" (substitute customer, supplier, candidate, portfolio company, prospect, whatever your business actually deals with). Someone on the team (or the requester themselves, using the model vendor's coding agent) writes a prompt as a markdown file in the GitHub repo. The prompt is wired into a routine that queries the CRM via MCP, pulls news from a Supabase table, pulls related documents from the document store via the custom MCP (which respects permissions), and writes the brief to your notes tool. The run gets logged in agent_runs. The requester reviews, edits, ships. Total elapsed time: 90 seconds.

Six months from now the model vendor might ship a routines product that's twice as good. You swap the runtime. Twelve months from now a different vendor might ship a model that's twice as fast at a tenth of the price. You swap the API client. The prompt, the schema, the MCP servers, the permissions, the audit log — none of that moves. The user experience doesn't change. The bill goes down.

That's what "structured for AI in 2026" means in practice.


V. What this costs

A complete, realistic budget for a company of 15–20 people running this stack:

  • GitHub Team: ~$4/user/month — ~$80/month
  • Linear: ~$8–14/user/month — ~$200/month
  • Vercel Pro: $20/month per project — $40–100/month
  • Supabase Pro: $25/month
  • Enterprise model account: ~$60/user/month + usage — ~$1,200/month + variable API
  • Managed routines runtime: $200–500/month depending on agent volume
  • 1Password Business: $8/user/month — ~$160/month
  • Google Workspace or Microsoft 365 (you already have this): existing
  • Notes tool, CRM, document store, communications (existing): existing

Total new spend if you have none of this: ~$1,900–2,300/month before variable API usage. Compare to one full-time platform engineer (~$15–20k/month all-in) and the math is straightforward. The point of the stack isn't to replace the engineer. It's to make one engineer (or one consultant, or one operationally inclined founder or operator) productive at the level a five-person platform team was in 2022.
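As a rough check on the arithmetic, the listed items can be summed for a 20-seat company with variable API usage left out. The per-seat prices come from the list. The item names, and the choice of 20 seats and of the rounded monthly figure for Linear, are assumptions of this sketch.

```typescript
interface LineItem {
  name: string;
  low: number; // USD per month
  high: number;
}

const seats = 20; // upper end of the 15 to 20 person company in the text

const lineItems: LineItem[] = [
  { name: "GitHub Team", low: 4 * seats, high: 4 * seats },
  { name: "Linear", low: 200, high: 200 }, // the list's rounded monthly figure
  { name: "Vercel Pro", low: 40, high: 100 },
  { name: "Supabase Pro", low: 25, high: 25 },
  { name: "Enterprise model seats", low: 60 * seats, high: 60 * seats },
  { name: "Managed routines runtime", low: 200, high: 500 },
  { name: "1Password Business", low: 8 * seats, high: 8 * seats },
];

function monthlyRange(items: LineItem[]): { low: number; high: number } {
  return items.reduce(
    (acc, item) => ({ low: acc.low + item.low, high: acc.high + item.high }),
    { low: 0, high: 0 },
  );
}

const range = monthlyRange(lineItems);
console.log(`$${range.low} to $${range.high} per month before variable API usage`);
```

For these figures the sum comes to $1,905 to $2,265 a month. Model API usage on top of seats is the line that can move the total most, so track it separately.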

The hidden cost is the discipline of not sprawling beyond this list. Every new tool that doesn't fit one of the seven pillars is a future migration project. The strongest signal that a company is doing this well isn't the tools they added — it's the ones they cancelled. The default move in the first month of any consolidation is killing two to four subscriptions that overlap with the pillars. Not for the savings — for the clarity.


VI. The migration playbook (90 days)

Most companies already have most of this somewhere, in pieces, in the wrong places. The work isn't building from scratch — it's consolidating. Here's the sequence that works.

Days 1–14 — audit and cancel. List every AI-adjacent tool. Cost, owner, last-time-used, what-it-does. Anything that hasn't been opened in 30 days, cancel. Anything that overlaps with another tool on the list, pick the winner and cancel the other. Anything in the "experiment" bucket that hasn't been promoted, cancel. Most teams cancel three or four subscriptions in this first pass. The point isn't the savings — it's clearing the field so the remaining tools are unambiguous.
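The three audit rules are mechanical enough to sketch as a function over the tool inventory. The `Tool` fields mirror the columns named above. The tie-break for overlapping tools (keep the most recently used) is my assumption, since in practice "pick the winner" is a judgment call.

```typescript
interface Tool {
  name: string;
  monthlyCost: number;
  owner: string;
  daysSinceLastUse: number;
  category: string; // what it does; overlapping tools share a category
  stage: "experiment" | "production";
}

function toCancel(tools: Tool[]): Set<string> {
  const cancel = new Set<string>();
  // Rules 1 and 3: stale tools and unpromoted experiments go.
  for (const tool of tools) {
    if (tool.daysSinceLastUse > 30 || tool.stage === "experiment") cancel.add(tool.name);
  }
  // Rule 2: among survivors in one category, keep the most recently used.
  const winners = new Map<string, Tool>();
  for (const tool of tools) {
    if (cancel.has(tool.name)) continue;
    const current = winners.get(tool.category);
    if (current === undefined) {
      winners.set(tool.category, tool);
    } else if (tool.daysSinceLastUse < current.daysSinceLastUse) {
      cancel.add(current.name);
      winners.set(tool.category, tool);
    } else {
      cancel.add(tool.name);
    }
  }
  return cancel;
}

// Illustrative inventory; names and costs are made up.
const inventory: Tool[] = [
  { name: "old-enricher", monthlyCost: 99, owner: "ops", daysSinceLastUse: 45, category: "enrichment", stage: "production" },
  { name: "notebook-app", monthlyCost: 20, owner: "analyst", daysSinceLastUse: 2, category: "notes", stage: "experiment" },
  { name: "crm-sync-a", monthlyCost: 50, owner: "sales", daysSinceLastUse: 1, category: "sync", stage: "production" },
  { name: "crm-sync-b", monthlyCost: 60, owner: "sales", daysSinceLastUse: 10, category: "sync", stage: "production" },
];
const cancelled = toCancel(inventory);
```

Here three of the four tools end up on the cancel list, which is in line with what a first pass usually looks like.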

Days 15–30 — base infrastructure. Single GitHub (or GitLab) org. Single Linear workspace. Single Vercel team. Single Supabase project. Identity hooked to your IDP via SAML where possible. Secrets moved into 1Password. Run a git push deploy of something — even a one-page status dashboard — to prove the path works.

Days 31–60 — connect the systems of record via MCP. Custom MCP servers for any tool that doesn't already have a clean one (or whose default one has the wrong permissions semantics). Your CRM, your document store, your notes tool, your support system, your marketing or outreach platform — whichever ones you actually use. Each MCP is a couple hundred lines of TypeScript, deployed on Vercel, with the permissions defined explicitly. This is where most of the work is. It's the only step in the playbook that genuinely needs an engineer.

Days 61–90 — first three routines. Pick three high-leverage recurring tasks. Common starting points: a nightly sync between two systems that should be one; an account brief generated from the CRM + the document store + the news feed; a weekly digest from public signal on accounts, portfolio companies, customers, or competitors you care about. Each routine is one prompt, one cron, one write target, one row in agent_runs. Resist the urge to build a fourth until the first three have run for two weeks without intervention.
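"One prompt, one cron, one write target" fits in a single typed list, sketched below. The routine names, file paths, schedules, and write targets are all hypothetical, and the validator only checks the shape of each definition; it does not parse cron expressions properly.

```typescript
interface RoutineDef {
  name: string;
  promptPath: string; // markdown file in the repo
  cron: string; // five-field cron expression
  writeTarget: string; // where the output lands
}

const routines: RoutineDef[] = [
  { name: "nightly-sync", promptPath: "prompts/nightly-sync.md", cron: "0 2 * * *", writeTarget: "crm" },
  { name: "account-brief", promptPath: "prompts/account-brief.md", cron: "0 7 * * 1-5", writeTarget: "notes" },
  { name: "weekly-digest", promptPath: "prompts/weekly-digest.md", cron: "0 8 * * 1", writeTarget: "notes" },
];

// Returns a list of problems; an empty list means the definitions look sane.
function validate(defs: RoutineDef[]): string[] {
  const problems: string[] = [];
  const names = new Set<string>();
  for (const def of defs) {
    if (names.has(def.name)) problems.push(`duplicate routine name: ${def.name}`);
    names.add(def.name);
    if (def.cron.trim().split(/\s+/).length !== 5) problems.push(`${def.name}: cron needs five fields`);
    if (!def.promptPath.endsWith(".md")) problems.push(`${def.name}: prompt must be a markdown file`);
  }
  if (defs.length > 3) problems.push("more than three routines: let the first three settle first");
  return problems;
}
```

Keeping the list in the repo means adding a routine is a pull request, which is the same review path as everything else in the stack.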

By day 90 you have a stack that's roughly the right shape. From there, every new use case is "add another routine" — not "evaluate a new platform". That's the architectural property you're paying for.


VII. What I'd say to a company considering this

If you're a founder, CEO, or COO reading this and the temptation is to delegate the whole thing to a platform team or a consultancy and stop thinking about it — don't. The reason this works at smaller companies (5–100 people) is precisely because one or two operationally-inclined humans hold the entire architecture in their heads. The minute you scale the platform team to five people, the architecture starts to drift toward the complexity that justifies five people.

Stay small. Pick the pillars. Cancel the things that don't fit. Treat the model and the agent harness as rental. Treat the schema, the permissions, the audit log, and the skills as owned. Re-check the rental layers every six months. Never re-check the owned layers more than once a year — they shouldn't need it.

That's what surviving the next eighteen months looks like. The teams that did this in 2024 watched themselves migrate from one frontier model to another this spring with one config change. The ones that didn't are mid-rewrite right now. There will be another migration in twelve months. There will be another in twenty-four. The architecture either absorbs them, or it doesn't.

The pillars are the part that doesn't move. Build there.


Black Matter VC builds and operates this exact stack for companies and funds that are serious about AI. Three months per engagement. Flat retainer, build + operate, no lock-in. — michael@blackmatter.vc · blackmatter.vc/lab
