Mintong Kang

Ph.D. Candidate, Computer Science, UIUC

mintong2 [AT] illinois.edu

Bio

I am a Ph.D. Candidate in Computer Science at the University of Illinois Urbana-Champaign, advised by Prof. Bo Li. I develop safety and security methods for advanced AI systems, with a focus on automatic red-teaming, robust guardrails, and certifiable safety guarantees for LLMs, multimodal models, audio-language models, and agentic AI systems.

I am currently a Student Researcher at Google DeepMind. Previously, I was a Research Intern at NVIDIA Research and Amazon AWS AI.

My work spans both attacks and defenses: uncovering emerging risks in frontier models and building reliable mechanisms to mitigate them through reasoning, policy grounding, and formal certification. My co-first-authored work DecodingTrust received the NeurIPS 2023 Outstanding Paper Award.

Publications

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Zhaorun Chen*, Xun Liu*, Haibo Tong*, Chengquan Guo*, Yuzhou Nie*, Jiawei Zhang*, Mintong Kang*, Chejian Xu*, Qichang Liu*, Xiaogeng Liu*, Tianneng Shi*, Chaowei Xiao, Sanmi Koyejo, Percy Liang, Wenbo Guo, Dawn Song, Bo Li

arXiv preprint, 2026

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis

Mintong Kang, Chong Xiang, Sanjay Kariyappa, Chaowei Xiao, Bo Li, Edward Suh

arXiv preprint, 2026

ARMs: Adaptive Red-Teaming Agent against Multimodal Models with Plug-and-Play Attacks

Zhaorun Chen*, Xun Liu*, Mintong Kang, Jiawei Zhang, Minzhou Pan, Shuang Yang, Bo Li

International Conference on Learning Representations (ICLR 2026)

C-SafeGen: Certified Safe LLM Generation with Claim-Based Streaming Guardrails

Mintong Kang, Zhaorun Chen, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

Poly-Guard: Massive Multi-Domain Safety Policy-Grounded Guardrail Dataset

Mintong Kang*, Zhaorun Chen*, Chejian Xu*, Jiawei Zhang*, Chengquan Guo*, Minzhou Pan, Ivan Revilla, Yu Sun, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2025)

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

Chejian Xu, Mintong Kang, Jiawei Zhang, Zeyi Liao, Lingbo Mo, Mengqi Yuan, Huan Sun, Bo Li

International Conference on Machine Learning (ICML 2025)

ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Zhaorun Chen, Mintong Kang, Bo Li

International Conference on Machine Learning (ICML 2025)

R2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning

Mintong Kang, Bo Li

International Conference on Learning Representations (ICLR 2025 Spotlight)

AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models

Mintong Kang, Chejian Xu, Bo Li

International Conference on Learning Representations (ICLR 2025)

FairGen: Controlling Sensitive Attributes for Fair Generations in Diffusion Models via Adaptive Latent Guidance

Mintong Kang, Vinayshekhar Bannihatti Kumar, Shamik Roy, Abhishek Kumar, Sopan Khosla, Balakrishnan Murali Narayanaswamy, Rashmi Gangadharaiah

Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Chejian Xu*, Jiawei Zhang*, Zhaorun Chen*, Chulin Xie*, Mintong Kang*, Zhuowen Yuan*, Zidi Xiong*, Chenhui Zhang, Lingzhi Yuan, Yi Zeng, Peiyang Xu, Chengquan Guo, Andy Zhou, Jeffrey Ziwei Tan, Zhen Xiang, Zinan Lin, Dan Hendrycks, Dawn Song, Bo Li

International Conference on Learning Representations (ICLR 2025)

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Lingbo Mo*, Zeyi Liao*, Chejian Xu, Mintong Kang, Jiawei Zhang, Chaowei Xiao, Bo Li, Huan Sun

International Conference on Learning Representations (ICLR 2025)

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Mintong Kang, Nezihe Merve Gürel, Ning Yu, Dawn Song, Bo Li

International Conference on Machine Learning (ICML 2024)

Rob-FCP: Certifiably Byzantine-Robust Federated Conformal Prediction

Mintong Kang, Zhen Lin, Jimeng Sun, Cao Xiao, Bo Li

International Conference on Machine Learning (ICML 2024)

COLEP: Certifiably Robust Learning-Reasoning Conformal Prediction via Probabilistic Circuits

Mintong Kang*, Nezihe Merve Gürel*, Linyi Li, Bo Li

International Conference on Learning Representations (ICLR 2024)

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Boxin Wang*, Weixin Chen*, Hengzhi Pei*, Chulin Xie*, Mintong Kang*, Chenhui Zhang*, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2023 Outstanding Paper Award)

DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification

Mintong Kang, Dawn Song, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2023)

FaShapley: Fast and Approximated Shapley Based Model Pruning Towards Certifiably Robust DNNs

Mintong Kang, Linyi Li, Bo Li

IEEE Conference on Secure and Trustworthy Machine Learning (SaTML 2023)

Data, Assemble: Leveraging Multiple Datasets with Heterogeneous and Partial Labels

Mintong Kang, Yongyi Lu, Alan L. Yuille, Zongwei Zhou

IEEE International Symposium on Biomedical Imaging (ISBI 2023)

Certifying Some Distributional Fairness with Subpopulation Decomposition

Mintong Kang*, Linyi Li*, Maurice Weber, Yang Liu, Ce Zhang, Bo Li

Advances in Neural Information Processing Systems (NeurIPS 2022 Spotlight)

Fairness in Federated Learning via Core-Stability

Bhaskar Ray Chaudhury, Linyi Li, Mintong Kang, Bo Li, Ruta Mehta

Advances in Neural Information Processing Systems (NeurIPS 2022 Spotlight)

MgSvF: Multi-Grained Slow vs. Fast Framework for Few-Shot Class-Incremental Learning

Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI 2021)

Service

Reviewer: ICML 2022-2026; NeurIPS 2022-2026; ICLR 2024-2026; TMLR 2026; AISTATS 2025; CVPR 2023-2025.

PC Member: AAAI 2024-2025.

Organizer: KLR@ICML 2023; NeurIPS 2024 Competition on LLM and Agent Safety.

Teaching Assistant: UIUC CS 441: Applied Machine Learning, Spring 2025.