Promptware Attacks Pose Security Risks for AI Models and Assistants

A new class of vulnerabilities targeting large language models has been identified, revealing how attackers can manipulate AI systems like Google’s Gemini to perform malicious activities. Recent research by security company SafeBreach details how such prompt-based attacks can be carried out.

Promptware involves carefully crafted inputs—texts, images, or audio—that are designed to manipulate an AI’s interpretation of instructions. Unlike traditional cyberattacks that exploit system flaws or memory issues, Promptware attacks target the AI’s reasoning process itself. These attacks are often more accessible and can have severe consequences.

While seen as uncommon, recent findings demonstrate that attackers can embed malicious prompts into everyday resources such as emails or calendar invites. When AI assistants process these resources, they may execute harmful instructions without detection.

The research focused on Google Gemini, an AI assistant used across web, mobile, and voice interfaces. Attackers have found ways to embed prompts into common items like calendar events or emails, tricking Gemini into doing unintented actions with no obvious signs to the user such as:

  • Sending spam or phishing messages
  • Generating toxic or harmful content
  • Controlling connected devices (e.g., opening windows, turning on appliances)
  • Tracking user location
  • Extracting sensitive data from emails or applications
  • Chaining commands across multiple platforms and tools

The core techniques involve indirect prompt injection and context poisoning. Attackers embed malicious instructions within shared resources that Gemini retrieves and processes during interactions.

Google responded promptly upon notification of these vulnerabilities, deploying measures such as improved input validation, sanitization, and prompt detection mechanisms.

AI systems offer significant benefits, but it also introduces new risks. And as they continue to evolve, ongoing review, safeguards and security updates remain essential.

Developers should focus on safeguards such as strengthening input validation, improving context awareness, and incorporating user confirmation steps to prevent prompt injection, and organizations that rely on AI assistants should run security assessments and adopt layered defenses to protect their data and assets.

To explore the full details of these vulnerabilities, visit SafeBreach’s blog report here.


Comments Section

Leave a Reply

Your email address will not be published. Required fields are marked *


Back to Top - Modernizing Tech