Author of the publication

On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection.

ACL (Findings), page 13573-13581. Association for Computational Linguistics, (2023)


Other publications of authors with the same name

Secrets of RLHF in Large Language Models Part II: Reward Modeling. CoRR, (2024)
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models. CoRR, (2023)
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback. CoRR, (2024)
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement. CoRR, (2023)
Navigating the OverKill in Large Language Models. CoRR, (2024)
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios. CoRR, (2024)
RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning. CoRR, (2024)
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding. EMNLP, page 4112-4122. Association for Computational Linguistics, (2022)
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models. CoRR, (2024)
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms. EMNLP (Findings), page 10262-10274. Association for Computational Linguistics, (2023)