Mapping the misuse of generative AI

Responsibility and safety

Authors

Nahema Marchal and Rachel Xu

New research analyzes today's misuse of multimodal generative AI to help build safer and more responsible technologies

Generative artificial intelligence (AI) models that can generate images, text, audio, video and more are enabling a new era of creativity and commercial opportunity. But as these capabilities expand, so does the potential for their abuse, including manipulation, fraud, bullying, or harassment.

As part of our commitment to the responsible development and use of AI, we partnered with Jigsaw and Google.org to publish a new paper analyzing how generative AI technologies are being misused today. Teams across Google use this and other research to develop better safeguards for our generative AI technologies, among other safety initiatives.

Together, we collected and analyzed nearly 200 media reports of public incidents of misuse published between January 2023 and March 2024. Drawing on these reports, we defined and categorized common tactics for misusing generative AI and found new patterns in how these technologies are being exploited and compromised.

By clarifying the current threats and tactics used across different types of generative AI outputs, our work can help shape AI governance and guide companies like Google and others building AI technologies in developing more comprehensive safety evaluations and mitigation strategies.

Highlighting the main categories of abuse

While generative AI tools represent a unique and compelling means of enhancing creativity, the ability to create tailored, realistic content has the potential to be used inappropriately by malicious actors.

By analyzing media reports, we identified two main categories of generative AI misuse tactics: exploiting generative AI capabilities and compromising generative AI systems. Examples of exploiting capabilities included creating realistic depictions of human likenesses to impersonate public figures; examples of compromising systems included "jailbreaking" to remove model safeguards and using adversarial inputs to cause malfunctions.

Relative frequency of generative AI abuse tactics in our dataset. Each case of abuse reported in the media could involve one or more tactics.
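As a purely hypothetical illustration of this kind of coding exercise (the records and tactic labels below are invented, not the paper's dataset or taxonomy), each reported incident can be tagged with one or more tactics and tallied to surface the most frequent patterns:

```python
from collections import Counter

# Hypothetical incident records: each media report is coded with one or
# more misuse tactics (labels are illustrative, not the paper's taxonomy).
incidents = [
    {"id": 1, "tactics": ["impersonation", "scam"]},
    {"id": 2, "tactics": ["jailbreaking"]},
    {"id": 3, "tactics": ["impersonation", "opinion manipulation"]},
]

# Tally how often each tactic appears; one incident can contribute to
# several tactics, matching the note in the figure caption above.
tactic_counts = Counter(
    tactic for incident in incidents for tactic in incident["tactics"]
)
for tactic, count in tactic_counts.most_common():
    print(f"{tactic}: {count}")
```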

Instances of exploitation, where malicious actors used easily accessible, consumer-level generative AI tools, often in ways that required no advanced technical skills, were the most common in our dataset. For example, we examined a high-profile case from February 2024 in which an international company reportedly lost HK$200 million (approximately US$26 million) after an employee was tricked into carrying out a financial transfer during an online meeting. In this case, every other "person" in the meeting, including the company's CFO, was in fact a convincing, computer-generated impostor.

Some of the best-known tactics we observed, such as identity theft, fraud, and synthetic personas, predate the invention of generative AI and have long been used to influence the information ecosystem and manipulate others. However, broader access to generative AI tools can transform the costs and incentives of information manipulation, giving these age-old tactics new potency and reach, particularly for those who previously lacked the technical sophistication to employ them.

Identifying abuse strategies and combinations

Falsifying evidence and manipulating human likenesses were the most common tactics in real-world misuse cases. During the period we analyzed, most misuse of generative AI aimed to influence public opinion, enable scams or other fraudulent activity, or generate profit.

By observing how malicious actors combine their generative AI misuse tactics in pursuit of their various goals, we identified specific combinations of misuse, which we refer to as strategies.

Diagram showing how malicious actors' goals (left) impact their abuse strategies (right).

Emerging forms of generative AI misuse that are not overtly malicious still raise ethical concerns. For example, new forms of political outreach are blurring the lines between authenticity and deception, such as when government officials speak a variety of voter-friendly languages without disclosing that they are using generative AI, or when activists use AI-generated voices of deceased victims to advocate for gun reform.

Although the study provides new insights into emerging forms of misuse, it is worth noting that this dataset is a limited sample of media reports. Media reports may prioritize high-profile incidents, which may in turn bias the dataset toward certain forms of abuse. Detecting or reporting cases of misuse may also be harder for those affected because generative AI systems are so new. Nor does the dataset allow a direct comparison between misuse of generative AI systems and traditional content creation and manipulation tactics, such as image editing or setting up "content farms" to produce large amounts of text, videos, GIFs, and images. So far, anecdotal evidence suggests that traditional content manipulation tactics remain more widespread.

Staying one step ahead of potential misuse

Our paper highlights opportunities to design initiatives that protect the public, such as promoting broad educational campaigns about generative AI, developing better interventions to shield people from malicious actors, or forewarning people and equipping them to spot and refute the manipulative strategies behind generative AI misuse.

This research helps our teams better protect our products by informing how we develop our safety initiatives. On YouTube, we now require creators to disclose when their content is meaningfully altered or synthetically generated and appears realistic. We have also updated our election advertising policies to require advertisers to disclose when their election ads include material that has been digitally altered or generated.

As we continue to expand our understanding of the malicious uses of generative AI and make further technological advances, we know it is more important than ever that our work does not happen in a silo. We recently joined the Coalition for Content Provenance and Authenticity (C2PA) as a steering committee member to help develop the technical standard and drive adoption of Content Credentials, tamper-resistant metadata that shows how content was created and edited over time.
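To make the idea of tamper-resistant provenance metadata concrete, here is a minimal, hypothetical sketch in Python. It is not the C2PA specification: real Content Credentials use a standardized manifest format and certificate-based signatures rather than a shared secret, but the sketch shows why any edit to the asset or its recorded history invalidates the credential.

```python
import hashlib
import hmac
import json

# Hypothetical signing key; real C2PA manifests are signed with
# certificate-backed keys, not a shared secret like this.
SECRET_KEY = b"demo-signing-key"

def sign_manifest(asset_bytes: bytes, edit_history: list[str]) -> dict:
    """Attach a tamper-evident provenance manifest to an asset (toy example)."""
    manifest = {
        "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),
        "edit_history": edit_history,
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify_manifest(asset_bytes: bytes, manifest: dict) -> bool:
    """Check that neither the asset nor its recorded history was altered."""
    claimed = dict(manifest)
    signature = claimed.pop("signature")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature, expected)
        and claimed["asset_sha256"] == hashlib.sha256(asset_bytes).hexdigest()
    )

image = b"...raw image bytes..."
manifest = sign_manifest(image, ["captured", "cropped", "color-adjusted"])
assert verify_manifest(image, manifest)             # untouched asset: passes
assert not verify_manifest(image + b"x", manifest)  # silent edit: fails
```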

In parallel, we are also conducting research that builds on existing red-teaming efforts, including improving best practices for safety-testing large language models (LLMs) and developing pioneering tools to make AI-generated content easier to identify, such as SynthID, which is being integrated into a growing range of products.
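SynthID's actual algorithm is not reproduced here; as a rough, generic illustration of how statistical text watermark detection can work, a generator can preferentially sample tokens from a keyed "green" partition of the vocabulary, and a detector with the same key can test whether that bias appears far more often than chance. Everything below (the key, the partition scheme, the example text) is illustrative only:

```python
import hashlib

def is_green(token: str, key: str = "watermark-key") -> bool:
    """Keyed pseudorandom partition of the vocabulary into green/red halves."""
    digest = hashlib.sha256((key + token).encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens: list[str]) -> float:
    """Fraction of tokens that fall in the keyed green partition."""
    return sum(is_green(t) for t in tokens) / len(tokens)

# Unwatermarked text lands near 0.5 on average; a watermarking generator
# that preferentially samples green tokens pushes this fraction well above
# 0.5, which a detector can flag with a simple statistical test.
tokens = "the quick brown fox jumps over the lazy dog".split()
print(f"green fraction: {green_fraction(tokens):.2f}")
```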

In recent years, Jigsaw has conducted research with misinformation creators to understand the tools and tactics they use, developed prebunking videos to forewarn people about attempts to manipulate them, and demonstrated that prebunking campaigns can improve resilience to misinformation at scale. This work is part of Jigsaw's broader portfolio of information interventions that help people protect themselves online.

By proactively addressing potential misuse, we can foster the responsible and ethical use of generative AI while minimizing its risks. We hope these insights into the most common misuse tactics and strategies will help researchers, policymakers, and industry trust and safety teams build safer, more responsible technologies and develop better measures to combat misuse.

Acknowledgments

This research was a collaborative effort between Nahema Marchal, Rachel Botvinick, Canfer Akbulut, Harry Law, Sébastien Krier, Ziad Reslan, Boxi Wu, Frankie Garcia, and Jennie Brennan.
