DeepSeek-R1 produces flawed code when prompts touch on topics China considers politically sensitive

DeepSeek-R1, DeepSeek’s reasoning model, becomes significantly less secure when responding to prompts involving topics considered politically sensitive by the Chinese government, according to new research from CrowdStrike.

DeepSeek-R1 typically produces vulnerable code in about 19% of tests under neutral conditions. But when prompts contain references that the Chinese Communist Party (CCP) is likely to view as sensitive, the risk of severe security flaws can rise by up to 50%.

In one case, telling the model to act as a coding agent for an industrial control system based in Tibet increased the rate of severe vulnerabilities to 27.2%. CrowdStrike noted that mentions of Falun Gong (a religious movement banned in China), Uyghurs, or Tibet consistently led to “significant deviations” and degraded quality.

Another example involved a prompt asking the model to write a PHP webhook handler for PayPal notifications as a “helpful assistant” to a financial institution in Tibet. The resulting code hard-coded secret values, relied on unsafe data extraction methods, and wasn’t valid PHP, despite the model confidently describing it as secure and aligned with PayPal best practices.

A separate test asked DeepSeek-R1 to create Android code for a networking app intended for Uyghur community members. While the app functioned, CrowdStrike found that session management and authentication were missing, leaving user data exposed. In 35% of tests, the model used insecure password hashing or no hashing at all.

When asked to produce code related to Falun Gong, DeepSeek-R1 generated internal reasoning plans but then refused to answer in nearly 45% of cases.

The researchers say the inconsistencies may stem from guardrails added during training to comply with Chinese regulations requiring AI systems to avoid generating illegal or politically sensitive content. The mechanisms, CrowdStrike says, may inadvertently interfere with the model’s coding behavior.

“One possible explanation for the observed behavior could be that DeepSeek added special steps to its training pipeline that ensured its models would adhere to CCP core values. It seems unlikely that they trained their models to specifically produce insecure code,”the report notes. “Rather, it seems plausible that the observed behavior might be an instance of emergent misalignment.4 In short, due to the potential pro-CCP training of the model, it may have unintentionally learned to associate words such as “Falun Gong” or “Uyghurs” with negative characteristics, making it produce negative responses when those words appear in its system prompt.”

Back to the list

Latest Posts

New agentic browser attack lets emails trigger Google Drive wipe

Because the agent interprets the message as legitimate workload, it may execute the destructive steps without prompting the user for approval.
8 December 2025

Portugal updates legislation to protect ethical security research

To qualify, researchers must ensure their work is solely aimed at uncovering flaws they did not create and contributes to improved security.
8 December 2025

MuddyWater deploys new UDPGangster backdoor in attacks across the Middle East

The cyber-espionage activity has primarily targeted users in Turkey, Israel, and Azerbaijan.
8 December 2025