KILab🚀

Welcome to the Knowledge Intelligence Lab (KILab)!

Our lab is directed by Dr. Ziyu Lyu. We are dedicated to advancing cutting-edge research in Knowledge Intelligence. We specialize in Secure and Trusted AI, Intelligent Information Retrieval, and Natural Language Processing. Recently, we have been focusing on AIGC Safety, Social Media Content Safety, and Reliable Recommender Systems.

We are actively recruiting Postdocs, PhD students, Master's students, and research interns. You are welcome to join us!

Featured Projects

🛡️ AI Safety

AdaCD

Adaptive Contrastive Decoding to mitigate over-refusal while preserving LLM safety

📄 Post 💻 GitHub

🔬 Computer Vision

D-Judge

A Large-Scale Dataset for Visual Research on AI-Synthesized and Natural Images

📄 arXiv 🤗 Dataset 💻 GitHub

⚡ AI Safety

MidPO

Dual Preference Optimization for Safety and Helpfulness in LLMs via MoE Framework

📄 arXiv 💻 Code (Soon)

⚖️ Fairness

FairWork

Fairness-aware Work Allocation and Assessment Demo

🎮 Demo

[2026.4] Congrats to Jin Zeng! Our work, “RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation”, has been accepted by WWW 2026.

[2026.4] Congrats to Pengyu Qi! Our work, “Please Refuse to Answer Me! Mitigating Over-Refusal in LLMs via Adaptive Contrastive Decoding”, has been accepted by ACL 2026 Main Conference.

[2025.8] Congrats to Pengyu Qi! Our work, “MidPO: Dual Preference Optimization for Safety and Helpfulness in LLMs via MoE Framework”, has been accepted by Findings of EMNLP 2025.

[2025.7] Congrats to Renyang Liu! Our collaborative work, “D-Judge: How Far Are We? Accessing the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance”, has been accepted by ACM MM 2025.

Recent

Please Refuse to Answer Me! Mitigating Over-Refusal in LLMs via Adaptive Contrastive Decoding

April 7 2026

Safety-aligned LLMs frequently generate refusal responses to harmless queries due to superficial lexical similarity with malicious ones — a phenomenon known as over-refusal. Existing approaches either reduce over-refusals or preserve safety, but rarely achieve both simultaneously. We propose AdaCD, a training-free and model-agnostic adaptive contrastive decoding method that dynamically adjusts the refusal token distribution to mitigate over-refusal while maintaining or even enhancing model safety.
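To give a rough feel for what contrastive decoding over a refusal token means, here is a minimal sketch. It is a generic illustration and not the AdaCD algorithm itself: the two-token vocabulary, the logit values, the function names `softmax` and `contrastive_adjust`, and the rule of picking the sign of `alpha` per query are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_adjust(base_logits, contrast_logits, alpha):
    """Generic contrastive step (hypothetical helper, not AdaCD's formula).

    alpha > 0 pushes the base distribution away from the contrast
    distribution; alpha < 0 pulls it toward the contrast distribution.
    """
    return [b + alpha * (b - c) for b, c in zip(base_logits, contrast_logits)]


# Toy vocabulary: index 0 = "Sure", index 1 = "Sorry" (a refusal token).
base = [1.0, 2.0]      # assumed logits for the plain query
contrast = [0.0, 3.0]  # assumed logits under a refusal-inducing prompt

# Assumption: the query is judged harmless, so refusal is suppressed.
harmless = softmax(contrastive_adjust(base, contrast, alpha=1.0))

# Assumption: the query is judged harmful, so refusal is reinforced.
harmful = softmax(contrastive_adjust(base, contrast, alpha=-0.5))

print("P(refusal), unadjusted:", softmax(base)[1])
print("P(refusal), harmless case:", harmless[1])
print("P(refusal), harmful case:", harmful[1])
```

In this toy setup the same adjustment lowers the refusal probability when `alpha` is positive and raises it when `alpha` is negative, which is one simple way to picture reducing over-refusal without giving up safety.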

RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation

April 7 2026

We study the problem of user preference drift in LLM-based recommendation and propose RAIE, a region-aware incremental editing framework. Instead of global updates or instance-level edits, RAIE introduces preference regions as structured units for localized adaptation. This design enables efficient continual learning while preserving stable preferences.

MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework

August 21 2025

Large Language Models (LLMs) need to be both helpful and safe, but achieving both is a major challenge. We propose MidPO, a Mixture of Experts (MoE) framework that fine-tunes two specialized experts for safety and helpfulness and uses a dynamic routing mechanism to adaptively balance them, outperforming existing methods.
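The idea of routing between a safety expert and a helpfulness expert can be pictured with a small sketch. This is a generic soft mixture-of-experts combination under assumed inputs, not the MidPO implementation: the names `route` and `mix_experts`, the router scores, and the expert output vectors are hypothetical.

```python
import math


def route(router_scores):
    """Softmax over per-expert router scores, giving weights that sum to 1."""
    m = max(router_scores)
    exps = [math.exp(s - m) for s in router_scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_outputs, weights):
    """Weighted sum of the experts' output vectors."""
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]


# Assumed outputs of the two experts for one token (toy 2-d vectors).
safety_out = [1.0, 0.0]
helpful_out = [0.0, 1.0]

# Assumed router scores: this input leans toward the safety expert.
weights = route([2.0, 0.0])
combined = mix_experts([safety_out, helpful_out], weights)

print("routing weights:", weights)
print("combined output:", combined)
```

Because the weights are computed per input, the balance between the two experts can shift from one query to the next instead of being fixed in advance.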

FairWork: A Generic Framework For Evaluating Fairness In LLM-Based Job Recommender System

July 20 2025

Dual-perspective, dual-granularity fairness evaluation for LLM-based job recommendation: we assess bias from both the user and the recruiter sides, at individual and group levels.

D-Judge: How Far Are We? Accessing the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance

July 1 2025

AI-generated images are more visually stunning than ever, but how do they really stack up against natural, real-world photos? We introduce D-Judge, a large-scale benchmark designed to systematically investigate and quantify the discrepancies that remain.