***www.GovBench.ai 🇺🇸** | AI + Government Alignment | Nonprofit 501(c)(6)*
Audience: Federal, State, and Local CAIOs/CIOs; FedRAMP PMO; DoD/DISA Authorizing Officials; ODNI
Threat Level: MEDIUM
Action Required
- Ban production use of PRC-trained LLMs across all government environments.
- Mandate that cloud providers remove PRC-origin models from FedRAMP/P-ATO model libraries (Amazon Bedrock, Azure Machine Learning, Google Vertex AI, and similar).
- Restrict PRC-trained LLMs to research enclaves only, under strict controls.
Key Takeaway
<aside>
PRC-trained LLMs appear high-performing but embed hidden guardrails aligned with Chinese state directives. These guardrails can be triggered by the very content U.S. personnel must analyze, creating silent mission risk. Removing these models from production is essential to protect trust in mission outputs.
</aside>
Rationale
- High performance, high risk. PRC-origin large language models (LLMs) such as DeepSeek have demonstrated stronger benchmark performance than nearly all Western-origin open-weight LLMs, even on U.S. government and military tasks, making them attractive to employees.
- PRC values alignment requirements. The PRC's Generative AI Measures require adherence to "socialist core values," security assessments, and algorithm filings—obligations that can shape content moderation, refusal behavior, and narrative sourcing.
- Hidden trigger risks. Testing shows that inserting Chinese-censored terms into prompts can flip PRC-origin LLMs into a mode-switch state, producing refusals, boilerplate disclaimers, or state-aligned narratives. In some cases, performance on U.S. government and military tasks dropped by up to 50% with only a handful of these terms present. These triggers can appear unintentionally (e.g., when analyzing China-related documents) or be planted maliciously through prompt injection. A minimal testing harness sketch follows this list.
- Unbounded risk surface. The full list of censored terms is unknown. This makes PRC-trained models unpredictable in mission environments. Without a definitive keyword list, these models cannot be reliably hardened against failure modes.
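The trigger testing described above lends itself to automation. The sketch below is a minimal, illustrative harness, not a definitive implementation: `query_model` stands in for whatever inference client the evaluation environment provides, the trigger terms are placeholders (the real censored-term list is unknown, which is the point), and substring matching is a crude proxy for refusal detection.

```python
from typing import Callable, Iterable

# Illustrative placeholders only: the full censored-term list is unknown,
# and production-grade refusal detection would be more robust than
# substring matching.
SUSPECTED_TRIGGER_TERMS = ["<suspected-censored-term-1>", "<suspected-censored-term-2>"]
REFUSAL_MARKERS = ["i cannot", "i'm sorry", "as an ai"]


def looks_like_refusal(response: str) -> bool:
    """Crude refusal heuristic; a real harness would also score narrative shift."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def trigger_test(query_model: Callable[[str], str],
                 task_prompt: str,
                 terms: Iterable[str] = SUSPECTED_TRIGGER_TERMS) -> dict:
    """Run the task prompt clean, then with each suspected trigger term
    appended, and report which terms flip the model into refusal mode."""
    baseline_refused = looks_like_refusal(query_model(task_prompt))
    flips = []
    for term in terms:
        poisoned = f"{task_prompt}\n\nBackground note mentions: {term}"
        if looks_like_refusal(query_model(poisoned)) and not baseline_refused:
            flips.append(term)
    return {"baseline_refused": baseline_refused, "flipped_by": flips}
```

In practice, evaluators would run a harness like this over a benchmark suite of government and military tasks and report `flipped_by` rates alongside the performance deltas cited above.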
Implementation Guidance
- Model Registry Controls
    - Purge PRC-origin models from production registries and CI/CD pipelines (a CI gate sketch appears at the end of this memo).
- Cloud Service Controls
    - Instruct FedRAMP cloud service providers to remove PRC-origin models from Amazon Bedrock, Azure Machine Learning, Vertex AI, and similar libraries.
- Research-Only Exception
    - Permit PRC-trained models in research enclaves under the same controls as foreign-developed software samples: isolated networks, no sensitive data, and full prompt/output logging (a minimal logging sketch follows this list).
- Red Team / Evaluation
    - Direct CAIOs to include censored-term trigger testing as part of pre-deployment AI evaluation (see the harness sketch under Rationale).
    - Track and report refusal/narrative shifts during internal benchmarks.
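The prompt/output logging required by the Research-Only Exception can be enforced at the client layer rather than left to operator discipline. A minimal sketch, assuming a generic `query_model` callable; the JSONL record format and log path are illustrative assumptions.

```python
import json
import time
from pathlib import Path
from typing import Callable


def with_audit_log(query_model: Callable[[str], str],
                   log_path: Path = Path("enclave_llm_audit.jsonl")) -> Callable[[str], str]:
    """Wrap an inference client so every prompt/response pair is appended
    to a JSONL audit log before the response is returned to the caller."""
    def logged(prompt: str) -> str:
        response = query_model(prompt)
        record = {"ts": time.time(), "prompt": prompt, "response": response}
        with log_path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        return response
    return logged
```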
Ban list: China-Trained LLMs
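A ban list such as the one above is only effective if enforced mechanically. The sketch below illustrates a CI/CD registry gate under assumed conventions: the manifest format ({"models": [{"id": ...}]}) and the pattern entries are hypothetical, with only DeepSeek named in this memo.

```python
import json
import sys
from pathlib import Path

# Illustrative entries; the operative list would be the centrally
# maintained ban list referenced above.
BANNED_MODEL_PATTERNS = ["deepseek", "<other-banned-model-family>"]


def is_banned(model_id: str) -> bool:
    name = model_id.lower()
    return any(pattern in name for pattern in BANNED_MODEL_PATTERNS)


def check_manifest(path: Path) -> int:
    """Return a nonzero exit code if the registry manifest lists any
    banned model, so the CI/CD pipeline blocks the deployment."""
    manifest = json.loads(path.read_text(encoding="utf-8"))
    violations = [m["id"] for m in manifest.get("models", []) if is_banned(m["id"])]
    for model_id in violations:
        print(f"BLOCKED: PRC-origin model in registry: {model_id}")
    return 1 if violations else 0


if __name__ == "__main__":
    sys.exit(check_manifest(Path(sys.argv[1])))
```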