OpenAI and Anthropic Accuse China of AI Model Theft, White House Evidence Unclear

At a Glance
  • OpenAI and Anthropic have publicly accused Chinese AI labs of industrial-scale campaigns to extract capabilities from U.S. frontier models through unauthorized distillation
  • Anthropic identified 24,000 fraudulent accounts and 16 million Claude exchanges linked to DeepSeek, Moonshot, and MiniMax
  • The White House has reportedly adopted similar “industrial-scale theft” language, but the most detailed public evidence comes from company disclosures rather than independently released government forensics

OpenAI and Anthropic have accused Chinese AI labs of industrial-scale campaigns to extract capabilities from U.S. frontier models.

OpenAI reported observing DeepSeek-linked efforts to evade access controls and harvest outputs from American frontier models. Anthropic identified three Chinese labs using about 24,000 fraudulent accounts and more than 16 million Claude interactions in what it calls industrial-scale extraction campaigns.

The White House reportedly adopted similar language. The most detailed public evidence so far comes from the companies themselves, but the administration’s published policy response is unambiguous.

The dispute turns on a technical distinction. Distillation, the practice of training a smaller model to mimic a larger model’s outputs, legitimately produces smaller, more efficient models when done on one’s own outputs or with permission.

OpenAI and Anthropic say the issue is not distillation itself but unauthorized extraction — repeatedly querying frontier models at scale through fraudulent accounts and proxy networks, then using the outputs to train competing models. Both companies frame this as circumvention of access controls, not licensed training data reuse.
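The mechanics at issue can be sketched with a toy example. In distillation, a student model is trained to match a teacher model’s output distributions (its “soft targets”) rather than just hard labels. The sketch below is purely illustrative: it uses a three-way probability distribution in place of a real language model and plain gradient descent in place of fine-tuning infrastructure.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; 0 means a perfect match."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Teacher's soft targets for one hypothetical query (illustrative values).
teacher = softmax([2.0, 1.0, 0.1])

# Student starts with no knowledge (uniform distribution).
student_logits = [0.0, 0.0, 0.0]

# Minimize KL(teacher || student) by gradient descent on the student's
# logits; the gradient works out to (student - teacher).
lr = 0.5
for _ in range(200):
    student = softmax(student_logits)
    grad = [s - t for s, t in zip(student, teacher)]
    student_logits = [x - lr * g for x, g in zip(student_logits, grad)]

final_gap = kl_divergence(teacher, softmax(student_logits))
print(f"Remaining divergence from teacher: {final_gap:.2e}")
```

At scale, this same matching process is repeated over millions of query-response pairs, which is why access to a frontier model’s outputs, licensed or not, is the critical input.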

The Company Cases

OpenAI’s most concrete allegations appear in a February memo to the House Select Committee. The company says “the majority of adversarial distillation activity” it observed appears to originate from China. OpenAI observed “accounts associated with DeepSeek employees” developing methods to circumvent access restrictions via “obfuscated third-party routers and other ways that mask their source.”

Programmers work at computer workstations amid allegations that Chinese AI companies used sophisticated code to secretly access American models. · Photo by Compagnons on Unsplash

The memo describes DeepSeek employees developing code to access U.S. AI models and collect outputs “for distillation in programmatic ways.” OpenAI says Chinese actors have moved beyond simple chain-of-thought extraction to “more sophisticated, multi-stage pipelines” mixing synthetic-data generation, cleaning, and reinforcement-style preference optimization.

The OpenAI memo describes a pattern closer to data mining than model copying. It alleges Chinese labs built pipelines to extract reasoning patterns, multi-turn conversations, and preference signals, not just single responses. The memo repeatedly uses qualifying language — “indicative,” “consistent with,” “we believe” — and does not attach account IDs, network logs, or other forensic artifacts to the public document.

Anthropic provided more specific numbers. The company said it identified “industrial-scale campaigns” by DeepSeek, Moonshot, and MiniMax.

Over 16 million exchanges with Claude moved through approximately 24,000 fraudulent accounts. One proxy network managed more than 20,000 fraudulent accounts simultaneously.

Anthropic made the attribution “with high confidence” using IP correlation, request metadata, infrastructure indicators, and corroboration from industry partners. The company described proxy-based “hydra cluster” architectures used to resell access to Claude and other frontier models.

The alleged incentive structure is straightforward. Licensing 16 million Claude interactions through standard enterprise pricing would run into the millions of dollars.

Operating a fraudulent-account proxy network, if the Anthropic numbers hold, would cost a fraction of that. Anthropic frames this gap as the economic driver behind the campaigns it describes.
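The back-of-envelope arithmetic behind that framing can be made concrete. The per-interaction API price and per-account proxy cost below are assumptions chosen for illustration, not figures from Anthropic or the reporting; only the interaction and account counts come from the company’s disclosure.

```python
# Figures from Anthropic's disclosure.
interactions = 16_000_000
accounts = 24_000

# Hypothetical unit costs, assumed for the sketch.
assumed_price_per_interaction = 0.25   # USD, illustrative enterprise rate
assumed_cost_per_fraud_account = 5.0   # USD, illustrative setup/upkeep cost

licensed_cost = interactions * assumed_price_per_interaction
proxy_cost = accounts * assumed_cost_per_fraud_account

print(f"Licensed access: ${licensed_cost:,.0f}")
print(f"Proxy network:   ${proxy_cost:,.0f}")
print(f"Ratio: {licensed_cost / proxy_cost:.0f}x")
```

Under these assumed prices, licensed access lands in the millions while the proxy route costs a small fraction of that, which is the gap Anthropic identifies as the economic driver.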

The White House Frame

Reuters reported on April 23, citing the Financial Times, that a memo by Michael Kratsios, director of the White House Office of Science and Technology Policy, accused China of “industrial-scale theft” of AI labs’ intellectual property. Reuters noted it could not independently verify the report, and the Kratsios memo has not been released as a primary document.

The White House, where officials have reportedly escalated accusations against China regarding systematic theft of AI technology. · Photo by Ari Gardinier on Unsplash

What the White House has published is more specific on the policy response than on the theft claim itself. The July 2025 AI Action Plan does not mention DeepSeek, distillation, or “industrial-scale theft.”

It does commit the administration to preventing “our adversaries from free-riding on our innovation and investment” — the same phrase OpenAI uses in its House Select Committee memo to describe Chinese distillation activity.

The plan directs the Commerce Department to strengthen AI compute export control enforcement, explore location verification on advanced AI chips to keep them out of “countries of concern,” and develop new export controls on semiconductor manufacturing sub-systems currently outside existing restrictions.

It creates evaluation authority through the Center for AI Standards and Innovation (CAISI) at Commerce to assess “potential security vulnerabilities and malign foreign influence arising from the use of adversaries’ AI systems,” including possible backdoors. It also directs State and Commerce to counter “Chinese companies attempting to shape standards for facial recognition and surveillance” in international governance bodies.

What is absent from the plan is the sharper accusation — the industrial-scale theft framing. That language appears only in the reported Kratsios memo.

The Evidence Gap

OpenAI’s memo, hedged throughout with “indicative,” “consistent with,” and “we believe,” makes a serious allegation but is not a public evidentiary dump: no logs, account IDs, or network forensics are attached.

Anthropic’s post is the most concrete public document in the set but still relies on company assertions.

This creates a credibility asymmetry. Companies have detailed logs and commercial incentives to overstate threats. Government agencies have investigative authority but political incentives to amplify geopolitical tensions.

Neither has released the kind of technical evidence that would allow independent verification of the scale claims.

What China Says It’s Doing

DeepSeek has not disputed the technical practice of distillation. Its R1 release page openly promotes distilled models: “Distill & commercialize freely!” The page says “API outputs can now be used for fine-tuning & distillation.”

This does not prove DeepSeek improperly distilled U.S. models; it shows only that DeepSeek publicly embraces distillation as a method and product category.

The dispute centers on whether Chinese labs properly licensed the frontier model outputs they used for training or evaded access controls to harvest them without authorization.

The companies describe sophisticated proxy architectures designed to mask the source of requests; OpenAI specifically says Chinese companies rely on “networks of unauthorized resellers” of its services to evade platform controls.

The DeepSeek R1 repository documents the distilled models the company openly ships. That public posture does not, on its own, confirm or refute the unauthorized-extraction allegations about DeepSeek’s earlier training runs.

The most detailed public evidence currently comes from company disclosures rather than independently released government forensics. The White House’s “industrial-scale theft” framing echoes those company claims, but the administration’s published playbook — export controls, chip location verification, and formal evaluations of adversary AI systems — is already well underway.