Local models are usually the wrong answer

The boring enterprise questions that actually matter

Vendor lock-in, legal review, and when local models actually make sense.

Last issue, I broke down the four things enterprises should evaluate when choosing AI coding tools: the model, the harness, the infrastructure, and the payment model (subscription, token-based, or per-request).

That last category is already shifting.

GitHub Copilot recently announced plan changes that remove most of the per-request subsidies, which makes the point about staying nimble more urgent than I expected when I wrote it.

This issue covers the three questions that slow enterprises down the most:

  1. How bad is vendor lock-in really?
  2. What does security and legal vetting actually involve?
  3. When do local models make sense?

Lock-in — less scary than you think, but not zero

It’s worth addressing directly: the risk is real, but it’s much smaller than most procurement teams fear.

The tooling itself is largely portable. A CLAUDE.md is a text file. Hooks and custom configurations take an afternoon to rewrite, and an agent can do most of it for you.

If you decide to move your team from Claude Code to Codex tomorrow, you’re not facing a months-long migration. You’re facing a few days of friction and one to two weeks of your developers recalibrating their prompting habits.

Where lock-in actually bites is in the contract, not the tooling.

A two-year enterprise agreement signed today is a bet that today’s best tool is still the right choice in 2027. Given how fast this space moves, that’s a significant bet.

Keep contracts short and exit clauses explicit — that’s the real mitigation, not worrying about whether your CLAUDE.md will port cleanly.

There’s a subtler version of the switching cost that’s worth understanding though.

Even if you stay with the same vendor, model updates can disrupt your workflow. When Anthropic shipped a new Opus version recently, many teams found it required a noticeably different prompting style to get the same quality of results — more explicit instructions, more courteous framing.

Same vendor, same harness, meaningfully different behaviour.

This isn’t unique to Anthropic — every major model update from any vendor carries some version of this friction.

GPT-5.5 came with extra goblins included. Really, it likes to talk about goblins a lot more.

So don’t take the marketing’s benchmark numbers at face value. Even when a model really is better, expect some friction until your developers get used to it.

Security and legal vetting — get the enterprise agreement, but be honest about what you’re protecting

A friend of mine recently told me about a developer at a large enterprise — locked-down corporate laptop, strict IT policy, no AI tools approved yet.

His workaround: taking photos of his screen with his phone and sending them to ChatGPT. This is what over-restrictive AI policies actually produce. Not compliance — creative non-compliance that’s far harder to monitor and control than just giving people proper access.

The most common pattern I see is a milder version of the same thing: developers have been using AI coding tools for six months before legal or IT finds out.

They signed up with a personal email, accepted consumer terms without reading them, and have been pasting internal code into a chat interface ever since. This is worth fixing, but not for the reasons most legal teams will tell you.

Consumer terms and enterprise agreements are meaningfully different. On a consumer plan, your prompts and the code you share may be used to improve the model. Data residency is typically unspecified and your request goes wherever there’s spare capacity. There’s no DPA, no contractual guarantees, no audit trail.

Enterprise agreements fix most of this: no training on your data, contractual data residency options, proper data processing agreements that actually hold up under GDPR scrutiny.

So yes, pay for the enterprise tier.

Don’t let developers run unsupervised on consumer accounts. That’s the straightforward advice.

But be honest about what you’re actually protecting. Private equity firms have started doing technical due diligence differently, because a standard web app that took a team six months to build can now be reproduced over a weekend with agents.

Many acquisition deals have quietly fallen through because the technical moat turned out to be shallower than anyone admitted.

Code, increasingly, is not a valuable asset.

What is worth protecting is your data — customer information, internal architecture decisions, unreleased product details — and the domain knowledge embedded in your prompts and workflows.

Those go into the model context and that’s what an enterprise agreement actually safeguards.

The right frame isn’t:

“lock everything down to protect our IP.”

It’s:

“get the proper agreement so your data isn’t training someone else’s model, then get out of the way and let your developers use the tools.”

Enterprises that spend six months in legal review while competitors ship with agents aren’t protecting anything — they’re just falling behind.

Local models — usually the wrong answer to the right question

I’ve had a client running local models on four Mac Minis stacked in a server rack. It worked, technically.

It was available to two developers. Everyone else was still on API-based tools.

This is roughly the state of most enterprise local model deployments in practice: a proof of concept that never quite scaled, maintained by whoever set it up, running a model that’s two generations behind the frontier.

That’s not an argument against local models. It’s an argument for being honest about when they actually make sense.

The legitimate use cases are narrower than vendors selling local model infrastructure will tell you. Air-gapped environments in defence and critical infrastructure have no choice. Then there are certain regulated industries where data genuinely cannot leave the building under any circumstances, and companies with IP sensitive enough that even a well-drafted enterprise agreement feels like too much trust. These are real, but they describe a minority of the enterprises currently evaluating local models.

The capability gap is closing but it’s not closed.

Models like DeepSeek, Qwen and Codestral are genuinely impressive, and a year ago I wouldn’t have said that. But “genuinely impressive” and “as good as the frontier on a hard architectural problem” are still different things.

You’re trading capability for control, and that’s a legitimate trade — just make sure you’re making it consciously.

The hidden costs are also worth naming. Hardware is the obvious one.

Less obvious: you’re now responsible for updating the models yourself, running inference infrastructure, and absorbing the engineering overhead that a cloud API quietly handles for you.

Frontier model providers ship improvements constantly. On a local deployment you’re on your own cadence.

And there’s a longer-term risk that most local model strategies don’t account for. Open source models were released to build community and mindshare — and it worked spectacularly. Qwen alone accumulated nearly a billion downloads.

But once you have that mindshare, the incentive to keep releasing your best weights weakens.

Qwen’s flagship model is now API-only for the first time in the project’s history, while a capable but less powerful version remains open.

The frontier is closing off, even among the labs that built their reputation on openness.

An enterprise strategy built around self-hosting the best available open model needs to reckon with the fact that “best available open model” may increasingly mean “second tier.”

For most enterprises the honest answer is this:

If you’re considering local models primarily because of data security concerns, an enterprise agreement with a proper DPA solves the same problem at a fraction of the cost and maintenance burden.

Local models are often a solution to a legal and procurement problem that has a cheaper answer. The exception worth considering is the hybrid approach: local for anything genuinely sensitive, API for everything else.

Classify your data, route accordingly. You get the security guarantees where they matter and frontier model capability everywhere else.
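To make “classify, then route” concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration: the `Route` names, the regex patterns, and the path prefixes are assumptions standing in for whatever your own data classification policy defines, and a real deployment would need far more careful detection than a few regular expressions.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL = "local"  # self-hosted model, data stays in-house
    API = "api"      # hosted frontier model under an enterprise agreement


# Hypothetical markers of sensitive content. A real policy would come from
# your data classification scheme, not a hard-coded list.
SENSITIVE_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),              # email addresses
    re.compile(r"\bCONFIDENTIAL\b", re.IGNORECASE),           # explicit labels
    re.compile(r"\b(?:customer|patient)_id\b", re.IGNORECASE),
]

# Hypothetical repository paths that are always treated as sensitive.
SENSITIVE_PATHS = ("billing/", "customers/")


@dataclass
class Request:
    prompt: str
    file_path: str = ""


def classify(request: Request) -> Route:
    """Send anything that looks sensitive to the local model, the rest to the API."""
    if request.file_path.startswith(SENSITIVE_PATHS):
        return Route.LOCAL
    if any(pattern.search(request.prompt) for pattern in SENSITIVE_PATTERNS):
        return Route.LOCAL
    return Route.API
```

The design choice worth noting is the default: in this sketch, unmatched requests go to the API. If your risk tolerance is lower, flip it so that only explicitly cleared paths leave the building and everything else stays local.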

Before signing anything, three questions worth answering

  1. Is your data actually sensitive enough to need a local model, or do you need a proper enterprise agreement?
  2. Is this a two-year contract in a market that changes every six months?
  3. Who internally owns the renewal decision, and do they know enough to make it?

This is also the kind of decision-making I help teams work through in AI-assisted development trainings: not just which tools to use, but how to adopt them without creating chaos.

Hit reply if you’re navigating any of this inside your team.

I read everything.
