TPG
2023-11

Researchers from the University of Chicago introduce Nightshade, an optimised prompt-specific poisoning attack in which poison samples look visually identical to benign images with matching text prompts.

"Data poisoning attacks manipulate training data to introduce unexpected behaviors into machine learning models at training time. For text-to-image generative models with massive training datasets, current understanding of poisoning attacks suggests that a successful attack would require injecting millions of poison samples into their training pipeline. In this paper, we show that poisoning attacks can be successful on generative models. We observe that training data per concept can be quite limited in these models, making them vulnerable to prompt-specific poisoning attacks, which target a model's ability to respond to individual prompts."

--Shawn Shan, Wenxin Ding, Josephine Passananti, Haitao Zheng, Ben Y. Zhao (https://arxiv.org/abs/2310.13828)
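The intuition behind prompt-specific poisoning is that a text-to-image model sees comparatively few training pairs for any single prompt, so an attacker only needs to corrupt that small slice rather than the whole dataset. The sketch below illustrates this with a naive "dirty-label" variant (mismatched image/caption pairs), not Nightshade's optimised, visually benign attack; all function names, paths, and counts are hypothetical.

```python
# Illustrative sketch of a naive "dirty-label" prompt-specific poison set.
# This is NOT the optimised Nightshade attack from the paper; names and
# numbers below (build_poison_pairs, cats/, dogs/, 50 poison pairs) are
# hypothetical and chosen only to show the per-prompt ratio argument.
from dataclasses import dataclass
from pathlib import Path
from typing import List
import random


@dataclass
class TrainingSample:
    image_path: Path  # path to an image file
    caption: str      # text prompt paired with the image during training


def build_poison_pairs(target_prompt: str,
                       mismatched_images: List[Path],
                       n_poison: int) -> List[TrainingSample]:
    """Pair images of an unrelated concept with the target prompt.

    A prompt-specific attack only needs to corrupt the small slice of
    training data associated with one prompt, so n_poison can be tiny
    relative to the full dataset.
    """
    chosen = random.sample(mismatched_images, k=min(n_poison, len(mismatched_images)))
    return [TrainingSample(image_path=p, caption=target_prompt) for p in chosen]


# Hypothetical example: corrupt the prompt "a photo of a dog" with cat images.
cat_images = [Path(f"cats/{i}.png") for i in range(500)]
poison = build_poison_pairs("a photo of a dog", cat_images, n_poison=50)
clean = [TrainingSample(Path(f"dogs/{i}.png"), "a photo of a dog") for i in range(400)]

# A few dozen poisoned pairs are already a large share of this prompt's
# training data, even though they are negligible for the dataset overall.
poisoned_fraction = len(poison) / (len(poison) + len(clean))
print(f"poisoned share of 'dog' training pairs: {poisoned_fraction:.0%}")
```

Nightshade's contribution is making such poison samples look indistinguishable from correctly captioned images, which this toy dirty-label sketch does not attempt.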
