Home

The research interests of the DIG team span knowledge graphs, LLMs, foundation models, graph mining, and data streams, united by a focus on structured knowledge and reasoning. The team develops methods for representing, integrating, and reasoning over complex, dynamic data to enable interpretable and trustworthy AI. Applications range from general-purpose AI to domain-specific areas such as legal AI and AI for health.
More specifically, the DIG team’s research activity covers the following topics:
  • Knowledge bases
  • Logic and algorithms
  • Language and relevance
  • Graph mining
  • Machine learning
  • Data streams
  • LLMs
  • Legal AI
  • AI for health

The DIG team has strong industrial collaborations.

The DIG team is a proud signatory of the TCS4F (Theoretical Computer Scientists for Future) pledge for sustainable research in theoretical computer science. A large majority of DIG members have signed the "No free view? No review!" pledge in favor of open access.

Research

Knowledge Bases

A knowledge base is a computer-processable collection of knowledge about the world. We construct and mine such knowledge bases.
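As a rough illustration of what "computer-processable" can mean, the sketch below stores a few facts as subject-predicate-object triples and answers a simple two-step query. The facts, the `match` helper, and the `capitals_in` function are hypothetical names chosen for this example; this is not the team's software and it ignores everything a real knowledge base needs (schemas, identifiers, provenance, scale).

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy facts, stored as (subject, predicate, object) triples.
FACTS = {
    ("Paris", "capitalOf", "France"),
    ("France", "locatedIn", "Europe"),
    ("Berlin", "capitalOf", "Germany"),
    ("Germany", "locatedIn", "Europe"),
}


def match(
    facts: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern (None = wildcard)."""
    for s, p, o in facts:
        if subject in (None, s) and predicate in (None, p) and obj in (None, o):
            yield (s, p, o)


def capitals_in(facts: Iterable[Triple], region: str) -> list:
    """Join two patterns: countries located in `region`, then their capitals."""
    facts = list(facts)
    countries = {s for s, _, _ in match(facts, predicate="locatedIn", obj=region)}
    return sorted(s for s, _, o in match(facts, predicate="capitalOf") if o in countries)
```

Calling `capitals_in(FACTS, "Europe")` joins the two patterns and returns the capitals of the countries recorded as located in Europe.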

Graph Mining

Graphs are a near-universal way to represent data. We are concerned with mining graphs for patterns and properties. Our particular focus is on the scalability of such approaches.

  • scikit-network: a Python package for the analysis of large graphs (clustering, embedding, classification, ranking).
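To give a feel for one of the tasks listed above, ranking, here is a minimal pure-Python sketch of PageRank by power iteration on a small directed graph. It deliberately does not use scikit-network, and the function name, parameters, and toy edge list are assumptions made for this example only; a scalable implementation would rely on sparse matrices rather than dictionaries.

```python
def pagerank(edges, damping=0.85, n_iter=50):
    """Approximate PageRank scores for a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    successors = {node: [] for node in nodes}
    for source, target in edges:
        successors[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(n_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not successors[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for node in nodes:
            if successors[node]:
                share = damping * rank[node] / len(successors[node])
                for target in successors[node]:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "a" and "b" both point to "c", and "c" points back to "a".
TOY_EDGES = [("a", "c"), ("b", "c"), ("c", "a")]
scores = pagerank(TOY_EDGES)
```

On this toy graph, the node with the most incoming links ("c") ends up with the highest score, and the scores sum to one.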

Data Streams

We investigate how to perform machine learning in real time, and we contribute to open-source tools:

  • River: a Python library for online machine learning
  • MOA: Massive Online Analysis, a framework for mining data streams (in Java)
  • Apache SAMOA: Scalable Advanced Massive Online Analysis, an open-source framework for data stream mining on the Hadoop ecosystem
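The idea of learning in real time can be sketched with a model that sees each example once: it first predicts, then updates itself, and accuracy is tracked along the way (often called test-then-train or prequential evaluation). The code below is a self-contained illustration using only the standard library. The class, the synthetic stream, and all parameter values are assumptions made for this example; it does not reproduce the API or the algorithms of River, MOA, or Apache SAMOA.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression updated one example at a time with stochastic gradient descent."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba_one(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict_one(self, x):
        return 1 if self.predict_proba_one(x) >= 0.5 else 0

    def learn_one(self, x, y):
        # Gradient of the log loss with respect to the linear score.
        error = self.predict_proba_one(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


def synthetic_stream(n, seed=0):
    """Hypothetical stream: points in [-1, 1]^2, labelled 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        yield x, int(x[0] + x[1] > 0)


def prequential_accuracy(model, stream):
    """Test-then-train: predict on each example before learning from it."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict_one(x) == y)
        model.learn_one(x, y)
        total += 1
    return correct / total if total else 0.0


accuracy = prequential_accuracy(OnlineLogisticRegression(n_features=2), synthetic_stream(2000))
```

The model never stores the stream: memory use stays constant however many examples arrive, which is the property that makes this style of learning suitable for unbounded data.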

Language and Relevance

Computer science is not just about computers. In this area of research, we investigate how humans reason, and what this implies for machines.

  • Simplicity theory seeks to explain the relevance of situations or events to human minds.
  • Relevance in natural language: the goal is to reverse-engineer, from our understanding of human performance, methods for producing meaningful and relevant speech.
  • We apply game theory and social simulation to explore the conditions under which providing valuable (i.e., relevant) information is a profitable strategy. Read this paper.

Team

Faculty

  • Talel Abdessalem
  • Mehwish Alam
  • Albert Bifet
  • Thomas Bonald
  • Jean-Louis Dessalles
  • Nils Holzenberger
  • Louis Jachiet
  • Van-Tam Nguyen
  • Nikola Simidjievski
  • Fabian Suchanek

Research engineer

Post-docs

  • Peter Fratrik
  • Fajrian Yunus
  • Alaa Mazouz

PhD candidates

  • Azzedine Ait Said
  • François Amat
  • Tom Calamai
  • Simon Coumes
  • Pierre Epron
  • Lorenzo Guerra
  • Samy Haffoudhi
  • Rajaa El Hamdani
  • Bérénice Jaulmes
  • Zhu Liao
  • Rémi Nahon
  • Hung Nguyen
  • Le Trung Nguyen
  • Van Chien Nguyen
  • Zakari Ait Ouazzou
  • Yiwen Peng
  • Roman Plaud
  • Ael Quelennec
  • Rachida Saroui
  • Samuel Reyd
  • Ali Tarhini
  • Long-Tuan Vo
  • Yinghao Wang

PhD track students

  • Zeinab Ghamlouch
  • Avrile Floro
  • Marc Farah
  • Daniela Cojocaru
  • Quoc-Dat Tran
  • Hai Thien Long Vu
  • Thanh Hai Tran
  • Thanh Nam Tran

News

We are hiring two PhD students and one postdoc to work on language models and knowledge graphs!

Best paper award at ISWC 2025

Yiwen Peng, Thomas Bonald, and Fabian Suchanek received the best paper award at ISWC 2025 for their paper "FLORA: Unsupervised Knowledge Graph Alignment by Fuzzy Logic".

Tuesday, December 2, 2025, 11:45, 4A125

François Amat
Mining Expressive Cross-Table Dependencies in Relational Databases
This thesis addresses the gap between what relational database schemas declare and the richer set of cross-table rules that actually govern real-world data. It introduces MATILDA, the first deterministic system capable of mining expressive first-order tuple-generating dependencies (FO-TGDs) with multi-atom heads, existential witnesses, and recursion directly …

Tuesday, October 28, 2025, 11:45, 4A125

Cristian Santini (University of Macerata)
Entity Linking and Relation Extraction for Historical Italian Texts: Challenges and Potential Solutions
Entity Linking and Relation Extraction enable the automatic identification of named entities mentioned in texts, along with their relationships, by connecting them to external knowledge graphs such as Wikidata. While these techniques work well on modern documents, …