TY - GEN
T1 - FANTASTIC COPYRIGHTED BEASTS AND HOW (NOT) TO GENERATE THEM
AU - He, Luxi
AU - Huang, Yangsibo
AU - Shi, Weijia
AU - Xie, Tinghao
AU - Liu, Haotian
AU - Wang, Yue
AU - Zettlemoyer, Luke
AU - Zhang, Chiyuan
AU - Chen, Danqi
AU - Henderson, Peter
N1 - Publisher Copyright:
© 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.
PY - 2025
Y1 - 2025
N2 - Recent studies show that image and video generation models can be prompted to reproduce copyrighted content from their training data, raising serious legal concerns about copyright infringement. Copyrighted characters (e.g., Mario, Batman) present a significant challenge: at least one lawsuit has already awarded damages based on the generation of such characters. Consequently, commercial services like DALL·E have started deploying interventions. However, little research has systematically examined these problems: (1) Can users easily prompt models to generate copyrighted characters, even if it is unintentional?; (2) How effective are the existing mitigation strategies? To address these questions, we introduce a novel evaluation framework with metrics that assess both the generated image's similarity to copyrighted characters and its consistency with user intent, grounded in a set of popular copyrighted characters from diverse studios and regions. We show that state-of-the-art image and video generation models can still generate characters even if characters' names are not explicitly mentioned, sometimes with only two generic keywords (e.g., prompting with “videogame, plumber” consistently generates Nintendo's Mario character). We also introduce semi-automatic techniques to identify such keywords or descriptions that trigger character generation. Using this framework, we evaluate mitigation strategies, including prompt rewriting and new approaches we propose. Our findings reveal that common methods, such as DALL·E's prompt rewriting, are insufficient alone and require supplementary strategies like negative prompting. Our work provides empirical grounding for discussions on copyright mitigation strategies and offers actionable insights for model deployers implementing these safeguards.
AB - Recent studies show that image and video generation models can be prompted to reproduce copyrighted content from their training data, raising serious legal concerns about copyright infringement. Copyrighted characters (e.g., Mario, Batman) present a significant challenge: at least one lawsuit has already awarded damages based on the generation of such characters. Consequently, commercial services like DALL·E have started deploying interventions. However, little research has systematically examined these problems: (1) Can users easily prompt models to generate copyrighted characters, even if it is unintentional?; (2) How effective are the existing mitigation strategies? To address these questions, we introduce a novel evaluation framework with metrics that assess both the generated image's similarity to copyrighted characters and its consistency with user intent, grounded in a set of popular copyrighted characters from diverse studios and regions. We show that state-of-the-art image and video generation models can still generate characters even if characters' names are not explicitly mentioned, sometimes with only two generic keywords (e.g., prompting with “videogame, plumber” consistently generates Nintendo's Mario character). We also introduce semi-automatic techniques to identify such keywords or descriptions that trigger character generation. Using this framework, we evaluate mitigation strategies, including prompt rewriting and new approaches we propose. Our findings reveal that common methods, such as DALL·E's prompt rewriting, are insufficient alone and require supplementary strategies like negative prompting. Our work provides empirical grounding for discussions on copyright mitigation strategies and offers actionable insights for model deployers implementing these safeguards.
UR - https://www.scopus.com/pages/publications/105010216850
UR - https://www.scopus.com/pages/publications/105010216850#tab=citedBy
M3 - Conference contribution
AN - SCOPUS:105010216850
T3 - 13th International Conference on Learning Representations, ICLR 2025
SP - 36670
EP - 36697
BT - 13th International Conference on Learning Representations, ICLR 2025
PB - International Conference on Learning Representations, ICLR
T2 - 13th International Conference on Learning Representations, ICLR 2025
Y2 - 24 April 2025 through 28 April 2025
ER -