Under Oath, Musk Admits xAI Used OpenAI's Models to Train Grok

Under oath, Elon Musk has acknowledged that xAI used OpenAI's models to help train Grok, its own competing AI assistant. His defense: that this is "standard practice" across the AI industry. That framing is worth examining carefully, because it's both defensible and deeply self-serving.

What "Distillation" Actually Means

Training an AI model requires large amounts of data. One shortcut is "distillation" - using the outputs of an existing capable model as training data for your own. Instead of labeling examples by hand, you run questions through a more powerful model, take the answers, and add those to your training set. Your model learns to produce similar outputs without needing the same compute budget.
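As a rough illustration of the data-generation step described above, here is a minimal Python sketch. It assumes a hypothetical `toy_teacher` function standing in for the stronger model; no real API is called, and the names `build_distillation_set` and `to_jsonl` are invented for this example rather than taken from any actual pipeline.

```python
import json
from typing import Callable, Dict, Iterable, List


def toy_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a query to a more capable model."""
    return f"Answer to: {prompt}"


def build_distillation_set(
    prompts: Iterable[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's answer to form training examples."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(examples: List[Dict[str, str]]) -> str:
    """Serialize examples one JSON object per line, a common fine-tuning format."""
    return "\n".join(json.dumps(ex) for ex in examples)


prompts = ["What is 2 + 2?", "Name a prime number."]
examples = build_distillation_set(prompts, toy_teacher)
jsonl = to_jsonl(examples)
```

The student model would then be fine-tuned on these prompt and completion pairs as if they were hand-labeled data. Nothing in the resulting file records where the completions came from.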

OpenAI's terms of service explicitly prohibit this for competitive purposes, barring users from using API outputs to train models that compete with OpenAI. If xAI used the OpenAI API to generate Grok training data, that's a direct violation, regardless of how common the practice is elsewhere.

Musk's "everyone does it" argument has real grounding. Several prominent open-source models have been found to contain synthetic data generated by ChatGPT or Claude. Early fine-tunes of Meta's LLaMA, such as Stanford's Alpaca and Vicuna, were trained partly on outputs from OpenAI's models, including shared ChatGPT conversations. The difference is that those projects were generally the work of independent researchers, not a well-funded competitor generating commercial revenue from the result.

The Enforcement Problem

This admission didn't come in a vacuum. Musk is currently in legal conflict with OpenAI, a company he co-founded, later left, and has since sued, alleging it violated its nonprofit charter. The deposition is part of those ongoing proceedings.

The practical enforcement problem is significant. If you generate millions of training examples using a competitor's API and then delete the logs, the competitor has no reliable way to prove it. Model weights don't contain a fingerprint that definitively reveals which data they were trained on. Terms-of-service restrictions on training data are essentially unenforceable without an admission, which is exactly what Musk has now provided.

Whether OpenAI pursues a counterclaim against xAI based on this is a business decision as much as a legal one. But the admission under oath matters. An industry that's been treating "don't train on competitor outputs" as an honor-system rule now has a high-profile case where someone violated it openly and called it normal.