US Launches Pre-Release AI Security Testing Program

AI Security Testing Program Announced By US Government

A quiet but significant shift in how the United States governs frontier artificial intelligence took effect on Tuesday, 5 May 2026. The Center for AI Standards and Innovation (CAISI) — the Commerce Department body that succeeded the Biden-era US AI Safety Institute — announced formal agreements with Google DeepMind, Microsoft, and xAI to evaluate their most advanced AI models for national security risks before those models are released to the public. The three companies join OpenAI and Anthropic, both of which have renegotiated their existing 2024 partnerships under the new framework.

For an industry that has spent two years insisting voluntary self-governance is sufficient, this is a meaningful inflection point. Five of the most consequential AI labs in the world have now formally accepted a pre-deployment government review of their frontier systems. The trigger, the structure, and the implications all matter — and not all of them are obvious from the headline.

What the Agreement Actually Does

Under the new arrangement, CAISI — which now sits within the National Institute of Standards and Technology (NIST) at the Department of Commerce — will conduct pre-deployment evaluations and targeted post-release research on the frontier models built by Google DeepMind, Microsoft, and xAI. The agreements were announced by CAISI Director Chris Fall and signed under the direction of Commerce Secretary Howard Lutnick.

Specifically, CAISI will:

  • Receive access to unreleased versions of frontier AI models from each lab
  • Often work with versions stripped of standard safety guardrails, so evaluators can probe genuine capability ceilings rather than guardrailed behaviour
  • Run pre-deployment evaluations for national security and public safety risks, with particular focus on cybersecurity
  • Conduct post-deployment research to track risks as models are deployed at scale
  • Build on more than 40 model evaluations CAISI has already completed, including on cutting-edge systems not yet publicly available

The agreements with Google, Microsoft, and xAI are described by CAISI as expansions of “previously announced partnerships,” which have been renegotiated to align with priorities in the Trump administration’s AI Action Plan. OpenAI and Anthropic, which signed earlier versions of these arrangements with the predecessor body in 2024, have likewise renegotiated their agreements under the new framework. Microsoft will work directly with US government evaluators on its frontier systems, and Microsoft Chief Responsible AI Officer Natasha Crampton confirmed the company sees CAISI as adding “technical, scientific and national security expertise” beyond what Microsoft’s internal testing produces.

The Trigger: Anthropic’s Mythos Model

The proximate catalyst, by all credible reporting, was Anthropic’s Mythos model — released earlier in spring 2026, and reportedly so powerful in identifying weaknesses and security flaws within software that Anthropic itself decided to limit its rollout. Multiple outlets, including CNN and Al Jazeera, have linked Mythos directly to the political momentum behind this week’s announcement, with Mythos’s cybersecurity capabilities described as having “pushed concerns about AI’s impact on cybersecurity to a tipping point” and helped prompt the White House to formalize a review process.

That framing matters because it shifts the conversation. For most of 2024 and 2025, the dominant policy concern around frontier AI was misuse for biological or chemical weapons design, plus longer-horizon worries about loss of control. Mythos has reframed the urgent risk as cyber. A model that can identify zero-day vulnerabilities at scale represents a different — and arguably more immediate — class of national security threat than one that helps a lay user synthesize a pathogen. The CAISI agreements explicitly center cybersecurity in their public framing, which suggests the policy machine has tracked the threat model through to its current state.

How This Differs From the Biden-Era Framework

The arrangement is not entirely new. In 2024, under President Biden, the predecessor body — then called the US Artificial Intelligence Safety Institute — signed analogous voluntary agreements with OpenAI and Anthropic. That body was led by Biden tech adviser Elizabeth Kelly, who has since moved to Anthropic. Under Biden, the institute focused on developing AI tests, definitions, and voluntary safety standards — work that was research-oriented and largely advisory.

The Trump administration’s reorganization has changed the framing in three meaningful ways:

Renaming and repositioning. The “Safety Institute” became the “Center for AI Standards and Innovation,” signaling a shift from a safety-first orientation to one balancing safety with industry competitiveness and innovation. CAISI has been formally designated as the federal government’s “primary point of contact” with industry on frontier AI matters, consolidating what had been a more diffuse set of agency relationships.

Alignment with the AI Action Plan. The agreements are explicitly tied to priorities in Trump’s AI Action Plan, which emphasizes US technological leadership and competitiveness with China alongside safety considerations. This frames evaluations partly as a national-competitiveness exercise rather than purely as risk mitigation.

Expansion to new labs. Bringing Google DeepMind, Microsoft, and xAI under the same umbrella as OpenAI and Anthropic means the regime now covers effectively the entire US frontier AI industry. Notably absent are foreign labs — DeepSeek, Mistral, and others — though that gap is consistent with a national security framing that focuses on US-jurisdiction companies.

What This Means for Each Company

Google DeepMind has been releasing Gemini 3.1 Pro across enterprise and consumer surfaces through 2026, and the company “declined to comment further on the agreement” beyond CAISI’s public statement. For Google, the arrangement formalizes a relationship it has been navigating informally since the Gemini 1.0 launch.

Microsoft publishes its own model evaluations regularly and has framed the CAISI agreement as additive rather than constraining. With Microsoft’s deep integration of OpenAI models into Azure and Copilot, plus its own emerging in-house models, the company has both commercial reasons and reputational reasons to align with federal review.

xAI is the most interesting signatory. Elon Musk’s lab did not respond to requests for comment, but xAI’s participation is notable given Musk’s vocal criticism of various AI safety regimes elsewhere. The fact that xAI signed under the Trump-era framework — with Musk’s complicated political relationship with the administration as backdrop — suggests the agreements were structured to be palatable to companies that might have resisted a more interventionist Biden-era version.

OpenAI and Anthropic, both already in the regime, have renegotiated to align with the AI Action Plan. For Anthropic in particular, with Mythos being the precipitating event, the renegotiation is almost certainly more demanding than the original 2024 arrangement.

What’s Actually Being Evaluated

CAISI’s public materials describe the evaluations as covering frontier AI capabilities and their potential impact on national security and public safety. Independent reporting fills in the picture: developers frequently hand over model versions with safety guardrails stripped back, so CAISI evaluators can probe for cybersecurity vulnerabilities, dangerous capabilities in biology and chemistry, and other dual-use risks at the model’s true capability ceiling.

This methodology — called dangerous capability evaluation in the AI safety literature — is meaningfully different from product-level safety testing. A consumer-facing model with strong refusal behaviour might score safely in user-facing evaluation but contain latent capabilities that, with the right prompting or fine-tuning, could be unlocked by a sophisticated adversary. CAISI’s mandate is to find those latent capabilities before adversaries do.

The Open Questions

The new regime is real, but it is also new enough that several substantial questions remain unresolved.

Is it actually mandatory? Officially, the agreements remain voluntary — companies are entering them as commercial partners, not as regulated entities. There is no enabling legislation that requires pre-release review. If a lab decided to refuse, the consequences would be reputational and potentially affect federal contracting, but not legally binding in any clear sense.

What happens if CAISI flags a problem? The agreements describe evaluation and research, but the public materials are vague on what authority CAISI has to delay or block a model release. In practice, most observers expect a flagging-and-negotiation dynamic: CAISI identifies a concern, the lab agrees to additional mitigations, the model ships with adjustments. But there is no formal kill switch.

Will foreign labs be included? DeepSeek-V4, the most disruptive open-weights model of 2026, was released by a Chinese lab under the MIT license with full open weights on Hugging Face — entirely outside any US regulatory regime. CAISI’s framework has no purchase on releases like this, which raises a structural question about whether national-security-by-pre-release-review is even the right architecture in an era of open-weights frontier models.

How is information protected? Frontier model weights are extraordinarily valuable trade secrets. The agreements presumably include strong information-handling provisions, but the specific details of how CAISI safeguards what it learns from these evaluations — and how it shares findings with other parts of the government — are not public.

The Broader Picture

What’s happening here is the early formalization of a state capacity that didn’t exist in 2023 and existed only in nascent form in 2024. The US government is building, in real time, the technical and institutional muscle to evaluate frontier AI systems for national security risks before they are deployed. That muscle is still developing. The agreements are voluntary, the evaluation methodologies are evolving, and the legal framework around what CAISI can require is thin.

But five labs — OpenAI, Anthropic, Google DeepMind, Microsoft, and xAI — are now formally inside the tent. The Mythos episode demonstrated, in a way that abstract policy debates never quite did, that frontier capabilities can outpace voluntary safety measures. The CAISI agreements are the policy response to that demonstration.

Whether this regime hardens into something more like the FDA model — with enforceable pre-market review — or remains a voluntary partnership for the foreseeable future will be one of the defining AI governance questions of the next two years. Either way, the era of “trust us, it’s safe” is, by industry’s own consent, ending.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *