Prompt-Driven LLM Safeguarding via Directed Representation Optimization
Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Md. Rizwan Parvez, Minlie Huang, and Nanyun Peng, in ICML, 2024.
Bib Entry
@inproceedings{zheng2024prompt,
title = {Prompt-Driven LLM Safeguarding via Directed Representation Optimization},
author = {Zheng, Chujie and Yin, Fan and Zhou, Hao and Meng, Fandong and Zhou, Jie and Parvez, Md. Rizwan and Huang, Minlie and Peng, Nanyun},
year = {2024},
booktitle = {ICML}
}
Related Publications