Resources

We are committed to promoting open research in the field of knowledge intelligence, sharing our research projects, datasets, benchmarks, and tools with the academic and industrial communities.

Research Projects #

Our research focuses on cutting-edge fields including trustworthy artificial intelligence, intelligent information retrieval, natural language processing, and knowledge intelligence applications, committed to promoting the deep integration of theoretical innovation and practical applications.

AI Security and Trust #

In today’s rapidly evolving AI landscape, ensuring the security and trustworthiness of AI systems is crucial. We conduct in-depth research in this frontier field:

Core Research Areas:

Trustworthy AI Algorithms: Design AI algorithms with robustness, explainability, and fairness
AIGC Security: Research on security risks and protection mechanisms for AI-generated content
AI Model Reliability: Enhance the stability of deep learning models under adversarial attacks
Privacy-Preserving AI: Develop federated learning and differential privacy technologies to protect user privacy

Intelligent Information Retrieval #

Information retrieval is the key technology connecting user needs with massive information. We focus on personalized and intelligent information retrieval system research:

Core Research Areas:

Recommendation Systems: Deep learning-based personalized recommendation algorithms
User Modeling and Understanding: Multi-dimensional user profiling and behavior prediction
Knowledge Acquisition: Automatic acquisition of structured knowledge from multi-source heterogeneous data
Search Technologies: Next-generation intelligent search engine technologies

Natural Language Processing #

Natural language processing is the core technology for achieving intelligent human-machine interaction. We conduct cutting-edge research in large-scale language models and multimodal understanding:

Core Research Areas:

Large Language Models (LLM): Optimization and application of pre-trained models
Multimodal Knowledge Extraction: Knowledge extraction fusing visual and linguistic information
Knowledge Representation Learning: Vectorized representation and reasoning of textual knowledge
Dialogue Systems: Knowledge-driven intelligent dialogue technologies

Spatial-Temporal Analysis and Prediction #

Spatial-temporal data analysis has important application value in intelligent transportation, urban computing, and other fields:

Core Research Areas:

Location-aware Social Network Analysis: Geographic location-based social behavior modeling
Real-time Push Algorithms: Spatial-temporal aware information push systems
Traffic Data Analysis: Data mining and prediction for intelligent transportation systems
Time Series Prediction: Deep learning prediction models for multivariate time series

Knowledge Intelligence Applications #

Apply knowledge intelligence technologies to real industry scenarios, solving complex real-world problems:

Core Research Areas:

Intelligent Finance: Knowledge graph-based financial risk control and investment decision-making
Intelligent Transportation: Traffic flow optimization and intelligent navigation systems
Recommendation System Applications: Industrial applications in e-commerce, content recommendation, and other fields
Multimedia Content Understanding: Intelligent analysis and understanding of videos and images

Research Project Support: The laboratory leads multiple important research projects including National Natural Science Foundation, Guangdong Natural Science Foundation, CCF-DiDi GAIA Young Scholar Program, and others.

Datasets #

Knowledge Graph Datasets #

CN-Academic-KG #

Chinese Academic Knowledge Graph

A large-scale knowledge graph containing over 1 million Chinese academic entities and relationships, covering multiple disciplines including computer science, mathematics, and physics.

GitHub Paper

MultiModal-KG #

Multimodal Knowledge Graph

A multimodal knowledge graph that integrates textual, visual, and numerical information, containing over 500K entities and 2M+ triples.

GitHub Paper

Natural Language Processing Datasets #

FewShot-RE-ZH #

Chinese Few-Shot Relation Extraction

A dataset specifically designed for Chinese few-shot relation extraction, containing 20 relation categories with only 1-5 annotated samples per category.

GitHub Paper

Dialogue-KG #

Dialogue Knowledge Graph

A knowledge graph built from real dialogues, used to evaluate the knowledge reasoning capabilities of dialogue systems.

GitHub Paper

Benchmarks #

KG-Benchmark Suite #

We have developed a comprehensive knowledge graph benchmark suite covering multiple core tasks:

Entity Linking #

Dataset: 5 domains, 100K+ annotated samples
Evaluation Metrics: Precision, Recall, F1-Score
Baseline Models: 10 SOTA model comparisons

Relation Extraction #

Dataset: Bilingual (Chinese-English), 50 relation types
Evaluation Metrics: Micro/Macro F1, AUC
Baseline Models: 15 classic and state-of-the-art models

Knowledge Reasoning #

Dataset: Multi-hop reasoning, complex queries
Evaluation Metrics: MRR, Hits@K
Baseline Models: 8 graph neural network models

GitHub Online Evaluation

Open Source Tools #

KG-Toolkit #

All-in-One Knowledge Graph Toolkit
A Python toolkit that integrates knowledge extraction, reasoning, and visualization capabilities

# Installation
pip install kg-toolkit

# Usage Example
from kg_toolkit import KnowledgeGraph, EntityExtractor

# Create knowledge graph
kg = KnowledgeGraph()

# Entity extraction
extractor = EntityExtractor(model='bert-base-chinese')
entities = extractor.extract("Zhang San is a professor at Sun Yat-sen University")

Key Features:

🚀 Multiple pre-trained model support
📊 Knowledge graph visualization
🔧 Flexible API interface
📚 Detailed documentation and tutorials

GitHub PyPI

NLP-Utils #

Natural Language Processing Utilities
Contains practical functions for text preprocessing, feature extraction, and model evaluation

Main Modules:

Chinese text segmentation and POS tagging
Named entity recognition
Relation extraction
Text classification
Semantic similarity computation

GitHub

Tutorials and Documentation #

Getting Started Tutorials #

Knowledge Graph Fundamentals #

A beginner-friendly knowledge graph tutorial covering theoretical foundations and practical operations. Start Learning

Graph Neural Networks in Practice #

An accessible introduction to the principles and implementation of graph neural networks. Start Learning

NLP Pipeline #

A complete NLP project tutorial from data preprocessing to model deployment. Start Learning

Technical Blog #

Regularly updated technical blog sharing latest research findings and technical insights:

Contribution Guidelines: We welcome community contributions including code, issue reports, and improvement suggestions! Please check the CONTRIBUTING.md file in each project for details.

Contact Us: For technical support or collaboration, please email contact@kilab.org