Mintong Kang

Publications

Selected and Recent Papers

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Zhaorun Chen*, Xun Liu*, Haibo Tong*, Chengquan Guo*, Yuzhou Nie*, Jiawei Zhang*, Mintong Kang*, Chejian Xu*, Qichang Liu*, Xiaogeng Liu*, Tianneng Shi*, Chaowei Xiao, Sanmi Koyejo, Percy Liang, Wenbo Guo, Dawn Song, Bo Li

arXiv preprint, 2026

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis

Mintong Kang, Chong Xiang, Sanjay Kariyappa, Chaowei Xiao, Bo Li, Edward Suh

arXiv preprint, 2026

C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails

Mintong Kang, Zhaorun Chen, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

Poly-Guard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset

Mintong Kang*, Zhaorun Chen*, Chejian Xu*, Jiawei Zhang*, Chengquan Guo*, Minzhou Pan, Ivan Revilla, Yu Sun, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li

International Conference on Machine Learning (ICML 2025)

R²-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning

Mintong Kang, Bo Li

International Conference on Learning Representations (ICLR 2025 Spotlight)

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

International Conference on Machine Learning (ICML 2024)

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Boxin Wang*, Weixin Chen*, Hengzhi Pei*, Chulin Xie*, Mintong Kang*, Chenhui Zhang*, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2023 Outstanding Paper Award)

All

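DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Zhaorun Chen*, Xun Liu*, Haibo Tong*, Chengquan Guo*, Yuzhou Nie*, Jiawei Zhang*, Mintong Kang*, Chejian Xu*, Qichang Liu*, Xiaogeng Liu*, Tianneng Shi*, Chaowei Xiao, Sanmi Koyejo, Percy Liang, Wenbo Guo, Dawn Song, Bo Li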

arXiv preprint, 2026

AudioGuard: Toward Comprehensive Audio Safety Protection Across Diverse Threat Models

Mintong Kang, Chen Fang, Bo Li

arXiv preprint, 2026

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis

Mintong Kang, Chong Xiang, Sanjay Kariyappa, Chaowei Xiao, Bo Li, Edward Suh

arXiv preprint, 2026

ARMs: Adaptive Red-Teaming Agent against Multimodal Models with Plug-and-Play Attacks

Zhaorun Chen*, Xun Liu*, Mintong Kang, Jiawei Zhang, Minzhou Pan, Shuang Yang, Bo Li

International Conference on Learning Representations (ICLR 2026)

C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails

Mintong Kang, Zhaorun Chen, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

Poly-Guard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset

Mintong Kang*, Zhaorun Chen*, Chejian Xu*, Jiawei Zhang*, Chengquan Guo*, Minzhou Pan, Ivan Revilla, Yu Sun, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li

International Conference on Machine Learning (ICML 2025)

ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Zhaorun Chen, Mintong Kang, Bo Li

International Conference on Machine Learning (ICML 2025)

R²-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning

Mintong Kang, Bo Li

International Conference on Learning Representations (ICLR 2025 Spotlight)

AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models

Mintong Kang, Chejian Xu, Bo Li

International Conference on Learning Representations (ICLR 2025)

FairGen: Controlling Sensitive Attributes for Fair Generations in Diffusion Models via Adaptive Latent Guidance

Mintong Kang, Vinayshekhar Bannihatti Kumar, Shamik Roy, Abhishek Kumar, Sopan Khosla, Balakrishnan Murali Narayanaswamy, Rashmi Gangadharaiah

Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Chejian Xu*, Jiawei Zhang*, Zhaorun Chen*, Chulin Xie*, Mintong Kang*, Zhuowen Yuan*, Zidi Xiong*, Chenhui Zhang, Lingzhi Yuan, Yi Zeng, Peiyang Xu, Chengquan Guo, Andy Zhou, Jeffrey Ziwei Tan, Zhen Xiang, Zinan Lin, Dan Hendrycks, Dawn Song, Bo Li

International Conference on Learning Representations (ICLR 2025)

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Lingbo Mo*, Zeyi Liao*, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Bo Li, Huan Sun

International Conference on Learning Representations (ICLR 2025)

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

International Conference on Machine Learning (ICML 2024)

Rob-FCP: Certifiably Byzantine-Robust Federated Conformal Prediction

Mintong Kang, Zhen Lin, Jimeng Sun, Cao Xiao, Bo Li

International Conference on Machine Learning (ICML 2024)

COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

Mintong Kang*, Nezihe Merve Gürel*, Linyi Li, Bo Li