Google Launches Red Team to Secure AI Systems Against Attacks

by Jeffrey Burt on July 24, 2023

Google is rolling out a red team charged with testing the security of AI systems by running simulated but realistic attacks to uncover vulnerabilities or other weaknesses that could be exploited by cybercriminals.

The recent unveiling of the AI Red Team comes weeks after the IT and cloud giant introduced its Secure AI Framework (SAIF). SAIF aims to give the industry the tools needed to push back against risks to AI systems and develop security standards to ensure AI technology is developed responsibly.

In addition, Google was one of seven companies developing AI technologies that agreed last week to a set of safeguards developed by the White House to certify that their AI products are safe. The other companies were Microsoft, Amazon, Meta and OpenAI; startups Inflection and Anthropic also signed onto the plan.

Google’s latest move also is part of its larger effort to evolve its red team operations to match the needs of emerging innovations in technology. In this case, it’s AI, and the new group will work in concert with Google’s existing red teams, according to Daniel Fabian, head of Google Red Teams.

“The AI Red Team is closely aligned with traditional red teams but also has the necessary AI subject matter expertise to carry out complex technical attacks on AI systems,” Fabian wrote in a recent blog post. “To ensure that they are simulating realistic adversary activities, our team leverages the latest insights from world-class Google Threat Intelligence teams like Mandiant and the Threat Analysis Group (TAG), content abuse red teaming in Trust & Safety, and research into the latest attacks from Google DeepMind [AI research lab].”

Building on a Common Idea

Red teams are nothing new. They’re essentially groups of hackers deployed by companies capable of throwing every kind of simulated attack–from nation-state adversaries and advanced persistent threat (APT) groups to hacktivists and insider threats–with the goal of showing the organizations where they’re vulnerable and where they need to bulk up their security.

Given the accelerated development of generative AI technologies and the way bad actors can use them in their attacks, the need for an AI-focused red team was clear, Fabian wrote.

“Exercises can raise findings across security, privacy and abuse disciplines, depending on where and how the technology is deployed,” he wrote. “To identify these opportunities to improve safety, we leverage attackers’ tactics, techniques and procedures (TTPs) to test a range of system defenses.”

Google detailed some of those TTPs that have been used by malicious groups–including prompts attacks, training data extraction, data poisoning and backdooring the model–in an accompanying report outlining the important role red teams can take to secure AI systems.

Prompt engineering is used to create prompts that effectively instruct the large language models (LLMs) that are the basis behind generative AI applications like OpenAI’s highly popular ChatGPT to perform a task. It plays an important role in successful LLM projects but also can be used by bad actors to include untrusted sources in a prompt “to influence the behavior of the model, and hence the output in ways that were not intended by the application,” according to the report.

In addition, backdooring a model involves an adversary using a specific “trigger word or feature”–the backdoor–to covertly change the way an LLM behaves to force it to produce incorrect outputs.

“In data poisoning attacks, an attacker manipulates the training data of the model to influence the model’s output according to the attacker’s preference,” Google wrote in the report. “Because of this, securing the data supply chain is just as important for AI security as the software supply chain. Training data may be poisoned in various places in the development pipeline.”

Adversarial examples and data exfiltration also are tactics used by bad actors against AI systems.

AI Expertise is Worth the Investment

“We’ve already seen early indications that investments in AI expertise and capabilities in adversarial simulations are highly successful,” Fabian wrote. “Red team engagements, for example, have highlighted potential vulnerabilities and weaknesses, which helped anticipate some of the attacks we now see on AI systems.”

Among the lessons learned is that expertise in AI will be critical for red teams given how complex AI systems are becoming and that it’s important to use red teams’ findings in R&D efforts. At the same time, enterprises don’t need to abandon what they already do. Using traditional security controls like properly locking down systems and models can reduce risk.

Also, “many attacks on AI systems can be detected in the same way as traditional attacks,” he wrote.

Jeffrey Burt

Jeffrey Burt has been a journalist for more than three decades, writing about technology since 2000. He’s written for a variety of outlets, including eWEEK, The Next Platform, The Register, The New Stack, eSecurity Planet, and Channel Insider.

jeffrey-burt has 12 posts and counting.See all posts by jeffrey-burt