Welcome to the Knowledge Intelligence Lab (KILab)!

Our lab is directed by Dr. Ziyu Lyu. We are dedicated to advancing cutting-edge research in Knowledge Intelligence. We specialize in Secure and Trusted AI, Intelligent Information Retrieval, and Natural Language Processing. Recently, we have been focusing on AIGC Safety, Social Media Content Safety, and Reliable Recommender Systems.

We are actively recruiting postdocs, PhD students, Master's students, and research interns. You are welcome to join us!

[2026.4] Congrats to Jin Zeng! Our work, “RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation”, has been accepted by WWW 2026.
[2026.4] Congrats to Pengyu Qi! Our work, “Please Refuse to Answer Me! Mitigating Over-Refusal in LLMs via Adaptive Contrastive Decoding”, has been accepted by ACL 2026 Main Conference.
[2025.8] Congrats to Pengyu Qi! Our work, “MidPO: Dual Preference Optimization for Safety and Helpfulness in LLMs via MoE Framework”, has been accepted by EMNLP 2025 Findings.
[2025.7] Congrats to Renyang Liu! Our collaborative work, “D-Judge: How Far Are We? Accessing the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance”, has been accepted by ACM MM 2025.

Recent

Please Refuse to Answer Me! Mitigating Over-Refusal in LLMs via Adaptive Contrastive Decoding

Safety-aligned LLMs frequently refuse harmless queries because of their superficial lexical similarity to malicious ones, a phenomenon known as over-refusal. Existing approaches either reduce over-refusals or preserve safety, but rarely achieve both simultaneously. We propose AdaCD, a training-free and model-agnostic adaptive contrastive decoding method that dynamically adjusts the refusal token distribution to mitigate over-refusal while maintaining or even enhancing model safety.
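
To give a rough feel for what adjusting the refusal token distribution at decoding time can look like, here is a minimal, generic contrastive-decoding sketch. It is not the AdaCD algorithm from the paper. The function names, the contrast logits, the set of refusal token ids, and the signed strength `alpha` are all hypothetical and chosen only for illustration. The sketch assumes that next-token logits are available from two runs of the same model (for example, under two different prompts), and that some external signal picks `alpha` for each query.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_refusal_logits(base_logits, contrast_logits, refusal_ids, alpha):
    """Shift only the refusal-token logits by alpha * (base - contrast).

    Hypothetical illustration, not the paper's method:
      alpha < 0 pushes refusal tokens down (less refusal),
      alpha > 0 pushes them up (more refusal),
    assuming the base run favors refusal tokens more than the contrast run.
    Logits of all other tokens are left unchanged.
    """
    adjusted = list(base_logits)
    for i in refusal_ids:
        gap = base_logits[i] - contrast_logits[i]
        adjusted[i] = base_logits[i] + alpha * gap
    return adjusted


# Toy vocabulary of three tokens; token 0 stands in for a refusal token.
base = [2.0, 1.0, 0.5]
contrast = [1.0, 1.0, 0.5]
refusal_ids = {0}

p_base = softmax(base)
# A query judged benign (hypothetical signal): suppress refusal tokens.
p_benign = softmax(adjust_refusal_logits(base, contrast, refusal_ids, alpha=-1.5))
# A query judged harmful (hypothetical signal): reinforce refusal tokens.
p_harmful = softmax(adjust_refusal_logits(base, contrast, refusal_ids, alpha=1.0))
```

In this toy setup, the refusal token's probability drops below its base value when `alpha` is negative and rises above it when `alpha` is positive, which is the kind of two-directional behavior the summary above describes. How the adjustment strength is actually chosen is left to the paper.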

RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation

We study the problem of user preference drift in LLM-based recommendation and propose RAIE, a region-aware incremental editing framework. Instead of global updates or instance-level edits, RAIE introduces preference regions as structured units for localized adaptation. This design enables efficient continual learning while preserving stable preferences.