Google’s Threat Intelligence Group (GTIG) has reported identifying malware that actively queries AI language models during attacks to generate code on the fly and evade detection.
The discovery marks a shift from attackers using AI as a productivity tool to deploying malware with built-in AI capabilities that adapt in real time. Google identified five new malware families in 2025 demonstrating this approach, with at least two confirmed operating in active campaigns.
Malware That Rewrites Itself
PROMPTFLUX, experimental malware discovered in June, contacts Google’s Gemini AI to completely rewrite its own code every hour. The VBScript dropper sends prompts instructing Gemini to act as a code obfuscator, generating new versions that evade antivirus detection while preserving functionality.
Each iteration perpetuates the cycle: the newly generated code carries the instructions needed to regenerate itself on the next pass. The malware uses a hard-coded API key and specifically requests Gemini’s latest model version so it keeps working as the model is updated.
Although still in development with incomplete features, PROMPTFLUX demonstrates active experimentation with self-modifying AI-powered malware. Google has disabled associated accounts and strengthened Gemini’s safeguards.
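For defenders, that hard-coded key is a tangible artifact to hunt for. The following is a minimal illustrative sketch, not code from Google’s report: it assumes a plaintext script sample, relies on the common “AIza” prefix used by Google API keys, and checks for the public Gemini API hostname, all of which should be treated as leads rather than confirmation.

    # Defensive sketch: scan a suspected script sample for embedded
    # Google-style API keys and references to the Gemini API endpoint.
    # The "AIza" prefix and 39-character length are common conventions
    # for Google API keys, not a guarantee.
    import re
    import sys

    KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")  # 4 + 35 = 39 chars
    GEMINI_HOST = "generativelanguage.googleapis.com"    # public Gemini API host

    def scan_sample(path: str) -> None:
        data = open(path, "r", errors="ignore").read()
        for match in KEY_PATTERN.finditer(data):
            print(f"possible embedded Google API key: {match.group(0)[:8]}...")
        if GEMINI_HOST in data:
            print(f"sample references the Gemini API host: {GEMINI_HOST}")

    if __name__ == "__main__":
        scan_sample(sys.argv[1])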
Other AI-enabled malware families have already been observed in active intelligence-gathering operations.
PROMPTSTEAL disguises itself as image-generation software while secretly contacting Hugging Face’s AI platform. It sends prompts requesting Windows commands to collect system information and copy documents, then immediately executes whatever the AI generates. Stolen data is sent to attacker-controlled servers.
Google’s analysis shows ongoing development, with newer samples adding obfuscation and changing communication methods. The malware likely uses stolen API tokens to access AI services through legitimate credentials.
Three other AI-enabled malware families were also found:
PROMPTLOCK is experimental ransomware that uses AI to generate malicious scripts at runtime for filesystem reconnaissance, data theft, and encryption across Windows and Linux systems.
QUIETVAULT steals developer credentials from GitHub and NPM, then leverages AI command-line tools on infected systems to search for additional secrets and exfiltrate them through newly created public repositories.
FRUITSHELL is a reverse shell containing hard-coded prompts designed to evade AI-powered security systems.
Fooling AI Safety Systems
Threat actors have also developed social engineering tactics aimed at the AI safety guardrails themselves. When Gemini blocked malicious requests, attackers reframed their queries, claiming to be cybersecurity competition participants or students working on academic projects.
One actor found that prefacing exploitation requests with statements like “I am working on a CTF problem” bypassed restrictions and yielded detailed guidance on phishing and system compromise. Iranian actors posed as students writing papers, inadvertently revealing their command-and-control infrastructure when asking Gemini for coding help. Google disabled the associated accounts and updated Gemini’s detection systems to recognize these social engineering patterns.
Some groups use AI throughout the entire attack life cycle. One researched cryptocurrency concepts and generated social engineering content in multiple languages to expand its targeting. Another used deepfake videos impersonating cryptocurrency industry figures to distribute backdoor malware.
Others have used AI to analyze stolen data, to work with command-and-control frameworks and code obfuscation libraries, and to support reconnaissance, phishing, lateral movement, command-and-control development, and data exfiltration across unfamiliar platforms, including cloud infrastructure.
Google has disabled accounts and API keys associated with the malicious activity, applied its findings to strengthen Gemini’s classifiers and underlying models, and aims to feed threat intelligence directly back into its AI safety systems.
The report updates Google’s January 2025 analysis, which found adversaries primarily using AI for productivity gains. The latest findings confirm threat actors have progressed to deploying AI-enabled malware in operational campaigns, representing what Google calls a new phase of AI abuse.
Organizations should monitor for unusual API traffic to AI services and consider implementing controls around AI platform access as threat actors increasingly integrate these capabilities into malware.
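As a rough illustration of what that monitoring could look like, the sketch below counts outbound requests to a hand-picked list of AI API hostnames from a simple CSV proxy log with src_host and dest_host columns; the log format, hostnames, and threshold are assumptions that would need to be adapted to each environment.

    # Sketch: flag internal hosts making an unusual volume of requests to
    # AI-service API endpoints. Assumes a CSV proxy log with
    # "src_host,dest_host" columns; hostnames and threshold are illustrative.
    import csv
    from collections import Counter

    AI_API_HOSTS = {
        "generativelanguage.googleapis.com",  # Gemini API
        "api-inference.huggingface.co",       # Hugging Face inference API
        "api.openai.com",
    }
    REQUEST_THRESHOLD = 50  # requests per log window considered unusual

    def flag_hosts(log_path: str) -> list[tuple[str, int]]:
        counts = Counter()
        with open(log_path, newline="") as fh:
            for row in csv.DictReader(fh):
                if row["dest_host"] in AI_API_HOSTS:
                    counts[row["src_host"]] += 1
        return [(host, n) for host, n in counts.items() if n >= REQUEST_THRESHOLD]

    if __name__ == "__main__":
        for host, n in flag_hosts("proxy.csv"):
            print(f"{host}: {n} requests to AI API endpoints this window")

Blocking or alerting on such traffic from hosts with no business reason to call AI APIs is a simple starting control; more mature environments would fold this into existing proxy or DNS analytics.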
