Security researchers have unveiled a novel attack technique that exploits manipulated images to compromise confidential user data. Developed by Kikimora Morozova and Suha Sabi Hussain from the cybersecurity firm Trail of Bits, this method builds upon concepts introduced in a 2020 study by TU Braunschweig. The researchers have successfully demonstrated how this technique can be applied to contemporary AI systems, raising significant concerns about data security.
The attack method capitalizes on the tendency of AI systems to automatically downscale uploaded images, a process designed to conserve computing resources and reduce costs. Common resampling algorithms, such as “Nearest Neighbor”, “Bilinear”, and “Bicubic”, are employed in this process. These algorithms can reveal hidden patterns within the original image when it is resized. Consequently, a maliciously altered image may contain covert instructions that become discernible only after downscaling, presenting a challenge for human users to detect.
In one illustrative example, the researchers noted that dark areas in an image could be transformed to red during the downscaling process, unveiling concealed black text. This text, once interpreted by the AI model, is treated as legitimate user input. Although the interface appears normal to users, in the background, harmful commands may execute, potentially leading to the unauthorized use of sensitive information.
In their tests, the researchers successfully forwarded calendar data from a Google account to an external email address using the “Gemini CLI” tool. This vulnerability affects multiple platforms, including Google’s Gemini models, the Google Assistant on Android, and the Genspark service.
To highlight the risks, Trail of Bits has developed an open-source tool called “Published,” designed to create images specifically tailored for different downscaling methods. This tool serves as a demonstration of how such attacks can be conducted and the potential consequences.
In light of these findings, the researchers offered several recommendations to mitigate the risks associated with these image-based attacks. They advise limiting the size of images during uploads and providing a preview of the reduced version to users. Furthermore, critical actions should not be executed automatically; instead, they should require user confirmation, particularly when extracting text from images.
The researchers emphasized that a robust system design is essential to defend against prompt injection attacks. By implementing systematic protective measures, users and organizations can significantly reduce the risk of multimodal AI applications becoming conduits for data exploitation.
With the increasing integration of AI technology in various sectors, ensuring the security of user data remains a pressing challenge. The insights from Kikimora Morozova and Suha Sabi Hussain underscore the need for vigilance and proactive measures in safeguarding against emerging threats in the digital landscape.
