Industry News

Hidden Model Lineage Is Becoming an Enterprise AI Procurement Problem

Cursor's Composer 2 disclosure mess isn't a Cursor story. It's an early warning that model provenance is joining enterprise AI due-diligence checklists.

5 min read • March 24, 2026
Enterprise AI · AI Procurement · Model Provenance · AI Governance · AI Supply Chain

Cursor launched a model. The internet looked inside.

Cursor shipped Composer 2 last week, calling it "frontier-level coding intelligence." Within hours, an X user named Fynn had spotted something in the model's own output: it identified itself as Kimi. As in, Kimi K2.5 — the open-source model from Moonshot AI, a Chinese startup backed by Alibaba and HongShan.

Cursor VP Lee Robinson quickly acknowledged it: yes, Composer 2 started from an open-source base. According to TechCrunch, Robinson said only about a quarter of compute came from the Kimi base, with the rest from Cursor's own continued pretraining and RL training. Co-founder Aman Sanger was more direct: "It was a miss to not mention the Kimi base in our blog from the start."

Yes. It was.

The backlash wasn't about Kimi

Here's the distinction worth drawing: using an open-source base model is fine. It's normal. It's often the smart path for a company that isn't in the business of pretraining frontier models from scratch. Cursor is a coding-tool company, not a lab. Fine-tuning a strong open model makes sense. The Decoder put it plainly: the problem is shipping someone else's base model under your own brand without saying so.

The problem here is the omission. "Frontier-level coding intelligence" is a specific claim. Enterprise buyers, security reviewers, and procurement teams evaluating AI tools ask specific questions when a vendor makes that claim. Questions like: what did you actually build? Where does the training data come from? Who trained the base? What are the licensing terms? Does this model carry any jurisdictional or export control sensitivities?

None of those questions could be answered from Cursor's announcement — because the announcement didn't mention the dependency at all. The story broke on social media, not in the product documentation. That's the disclosure failure.

Hacker News captured the developer reaction fast: the thread "Cursor Composer 2 is just Kimi K2.5 with RL" hit 276 points and 166 comments. VentureBeat elevated it into a broader question about Western AI's open-source dependencies and disclosure norms. That's not mass-market viral — it's the audience that feeds enterprise procurement decisions.

There's a new layer of the AI stack nobody's interrogating yet

Modern AI products are stacks. Not single models. There's a base model (open or proprietary), a fine-tuning layer, evaluation layers, product wrappers, hosting infrastructure, inference partners (Fireworks AI, in Cursor's case), and connector layers.

Until recently, enterprise scrutiny stopped at outputs: is it accurate, does it hallucinate, can we see what it retrieved? That question set is expanding.

The questions landing in procurement and security reviews look more like this now:

  • What base model is this product built on?
  • Who trained it, and under what license?
  • Can this be commercially deployed in our jurisdiction?
  • What upstream dependencies exist, and are they documented?
  • When the vendor says "our model" — what portion is actually theirs?

These aren't hypothetical concerns from paranoid legal teams. They're the same class of question enterprises already ask about software dependencies. The difference: software teams have had SBOMs — Software Bills of Materials — for years. AI doesn't have that yet. It's starting to need it.

The next AI trust crisis may not be hallucination. It may be hidden supply chains.

What enterprises already do — and where AI lags

Software supply chain transparency became a serious compliance concern after SolarWinds in 2020. The subsequent push for SBOMs in federal procurement made dependency disclosure standard practice for any vendor selling into regulated environments. Today, serious software vendors don't ship without some version of that documentation.

AI procurement is behind. Buyers are still mostly evaluating models on benchmark performance and demo quality. But as courts have started demanding provenance for AI outputs and enterprise risk teams get more sophisticated, the scrutiny is moving up-stack. It won't stop at "what did the model say" — it'll extend to "where did the model come from."

The Cursor situation is a preview of what happens when that question gets asked and the answer wasn't in the launch post.

Where Mojar sits in this

Mojar isn't a model company. We build the knowledge infrastructure that enterprise AI runs on — the retrieval layer, the document management system, the source attribution chain. That's a different part of the stack.

But the underlying principle is the same one driving the Cursor backlash: enterprise AI trust is moving toward full-stack provenance. Buyers want to know what their AI read. They want to know where the sources came from, whether those sources are current, and whether anything has changed since. Guardrails alone aren't enough — enterprises need to prove what their AI actually saw.

Governed, source-attributed RAG provides that at the knowledge layer. Every answer cites the document it came from. Every document is maintained, versioned, audited. The evidence chain is visible — not inferred.

If enterprises are going to demand provenance from model vendors, they should demand the same from the knowledge systems those models query. The two requirements are the same idea at different points in the stack.

What to watch

Expect procurement questionnaires to start including model-lineage disclosure requirements. Expect AI vendors to start publishing foundation model acknowledgments proactively — Cursor's co-founder already said they'll do this for the next release. Watch for SBOM-adjacent norms emerging for AI: base model, fine-tuning methodology, inference partner, training data provenance.

The developers and security engineers who drove the Cursor conversation this week are the same people who will write enterprise procurement checklists next year. The wave moves slowly enough to prepare for, but not so slowly that you can ignore it.

Related Resources

  • Courts Are Not Debating AI Anymore — They Are Debating Provenance
  • Guardrails Aren't Enough — Enterprises Need to Prove What Their AI Saw