Computer Science Alum Seminar by Hongbin Liu
Speaker: Hongbin Liu
Abstract:
As generative AI rapidly advances, its safety and security risks grow just as quickly. Pre-training data is the cornerstone of powerful generative AI systems: model providers (e.g., OpenAI) often collect Internet-scale pre-training data from numerous data owners (e.g., Wikipedia). My talk will cover the risks of unauthorized and untrusted pre-training data and countermeasures against them. In the first part, I will discuss our pre-training data-use auditing method for data owners. Specifically, Internet data may be used as pre-training data by model providers without proper authorization from data owners, raising significant concerns about data privacy and misuse. To tackle this, our data-use auditing method enables data owners to verify whether their data was used in pre-training, with only black-box access to a foundation model. In the second part, I will discuss how to mitigate the risks of untrusted pre-training data for model providers. In particular, I will describe our patching defense, designed to remove backdoors from foundation models once a backdoor attack has been detected. Because a backdoored foundation model can affect every intelligent application built upon it, our defense is crucial for maintaining ecosystem-wide security for generative AI. Our method effectively removes backdoors while preserving the foundation model’s utility. Finally, I will briefly discuss my other projects and future research directions.
Bio:
Hongbin Liu is a Ph.D. candidate at Duke University, where he is advised by Prof. Neil Gong. He is also a research scientist at Google DeepMind, contributing to Gemini. Hongbin earned his B.S. in Information Science from the University of Science and Technology of China in 2020. His research focuses on building safe and secure generative AI systems. He has published multiple papers in security venues such as CCS and USENIX Security, as well as AI/ML conferences such as ICLR, CVPR, and NeurIPS.