Home

The research of the DIG team focuses on the computational aspects of data science, machine learning, and artificial intelligence. The objective is to make knowledge easy to extract, especially from textual sources, so that it can be stored, processed, queried, and understood by machines.
More specifically, the DIG team’s research activity covers the following topics:
• Database theory,
• Graph mining,
• Machine learning,
• Natural language processing,
• Knowledge bases,
• Machine reasoning,
• Collective intelligence.

The DIG team has strong industrial collaborations.

The DIG team is a proud signatory of the TCS4F (Theoretical Computer Scientists for Future) pledge for sustainable research in theoretical computer science. A large majority of DIG members have signed the "No free view? No review!" pledge in favor of open access.

Research

Knowledge Bases

A knowledge base is a computer-processable collection of knowledge about the world. We construct and mine such knowledge bases.

Graph Mining

Graphs are a near-universal way to represent data. We are concerned with mining graphs for patterns and properties. Our particular focus is on the scalability of such approaches.

  • scikit-network: a Python package for the analysis of large graphs (clustering, embedding, classification, ranking).
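
As an illustration of what ranking the nodes of a graph can mean, here is a minimal sketch of PageRank by power iteration in plain Python. It does not use the scikit-network API; the function name `pagerank`, the toy graph, and the parameter values are assumptions chosen for this example only.

```python
def pagerank(adjacency, damping=0.85, n_iter=100):
    """Rank the nodes of a directed graph given as {node: [successors]}."""
    nodes = sorted(set(adjacency) | {v for succ in adjacency.values() for v in succ})
    n = len(nodes)
    scores = {u: 1.0 / n for u in nodes}
    for _ in range(n_iter):
        # Score held by nodes without out-links is spread uniformly.
        dangling = sum(scores[u] for u in nodes if not adjacency.get(u))
        new_scores = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        # Every other node shares its score equally among its successors.
        for u in nodes:
            successors = adjacency.get(u, [])
            for v in successors:
                new_scores[v] += damping * scores[u] / len(successors)
        scores = new_scores
    return scores


# Hypothetical toy graph: "c" receives links from every other node.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On large graphs the same iteration would typically be written with sparse matrix products rather than Python dictionaries, which is where scalability becomes the main concern.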

Social Web

The Web has evolved more and more into a social Web: content is produced and shared by users. In the DIG team, we follow and anticipate developments in this area.

  • Community detection: We are investigating means to detect and distinguish social communities on the Web.
  • Social Relations: We investigate the optimal investment in social relations from a theoretical point of view.

Language and Relevance

Computer science is not just about computers. In this area of research, we investigate how humans reason, and what this implies for machines.

  • Simplicity Theory: Simplicity theory seeks to explain the relevance of situations or events to human minds. See http://www.simplicitytheory.science
  • Relevance in natural language: The aim is to reverse-engineer, from our understanding of human performance, methods for producing meaningful and relevant speech.
  • Communication as social signalling: We apply game theory and social simulation to explore the conditions under which providing valuable (i.e. relevant) information is a profitable strategy.

Machine Learning for Data Streams

We investigate how to do machine learning in real time, contributing to new open source tools:

  • River: a Python library for online machine learning
  • MOA: Massive Online Analysis, a framework for mining data streams (in Java)
  • Apache SAMOA: Scalable Advanced Massive Online Analysis, an open source framework for data stream mining on the Hadoop ecosystem
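
To give a flavor of learning in real time, the sketch below trains a logistic regression one example at a time and evaluates it in a test-then-train (prequential) fashion: each example is first used to test the model, then to update it. It is written in plain Python and is not taken from River, MOA, or SAMOA; the class `OnlineLogisticRegression`, its method names, the synthetic stream, and the learning rate are all assumptions made for illustration.

```python
import math
import random


class OnlineLogisticRegression:
    """Hypothetical minimal model updated by one SGD step per example."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba_one(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clip to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss for a single example.
        error = self.predict_proba_one(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def synthetic_stream(n, seed=0):
    """Yield (features, label) pairs one at a time, as a stream would."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, int(x[0] + x[1] > 0)


model = OnlineLogisticRegression(n_features=2)
correct = 0
total = 0
for x, y in synthetic_stream(2000):
    prediction = int(model.predict_proba_one(x) >= 0.5)  # test first...
    correct += prediction == y
    total += 1
    model.learn_one(x, y)  # ...then train on the same example
accuracy = correct / total
print(f"prequential accuracy: {accuracy:.3f}")
```

The model never stores the stream: memory use stays constant however many examples arrive, which is the property that makes this style of learning suitable for unbounded data.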

People

Talel Abdessalem, Mehwish Alam, Albert Bifet, Thomas Bonald, Jean-Louis Dessalles, Nils Holzenberger, Louis Jachiet, Van-Tam Nguyen, Fabian Suchanek

Faculty

Research engineer

Post-docs

  • Peter Fratrik
  • Fajrian Yunus
  • Alaa Mazouz

PhD candidates

  • Van Chien Nguyen. Advisors: Samuel Tardieu and Van-Tam Nguyen
  • Le Trung Nguyen. Advisors: Enzo Tartaglione and Van-Tam Nguyen
  • Yinghao Wang. Advisors: Enzo Tartaglione and Van-Tam Nguyen
  • Zhu Liao. Advisors: Enzo Tartaglione and Van-Tam Nguyen
  • Rémi Nahon. Advisors: Enzo Tartaglione and Van-Tam Nguyen
  • Ael Quelennec. Advisors: Enzo Tartaglione, Pavlo Mozharovsky and Van-Tam Nguyen
  • Ali Tarhini. Advisors: Paul Chollet and Van-Tam Nguyen
  • Long-Tuan Vo. Advisors: Mehwish Alam, Pavlo Mozharovsky and Van-Tam Nguyen
  • Zakari Ait Ouazzou. Advisors: Talel Abdessalem and Albert Bifet
  • François Amat. Advisor: Fabian Suchanek
  • Tom Calamai. Advisors: Fabian M. Suchanek and Oana Balalau
  • Simon Coumes. Advisor: Fabian M. Suchanek
  • Pierre Epron. Advisors: Mehwish Alam and Adrien Coulet
  • Rajaa El Hamdani. Advisors: Thomas Bonald and Fragkiskos Malliaros
  • Yiwen Peng. Advisors: Thomas Bonald and Mehwish Alam
  • Roman Plaud. Advisors: Thomas Bonald, Mathieu Labeau and Antoine Saillenfest
  • Samuel Reyd. Advisors: Ada Diaconescu and Jean-Louis Dessalles
  • Zacchary Sadeddine. Advisor: Fabian Suchanek
  • Samy Haffoudhi. Advisors: Fabian Suchanek and Nils Holzenberger

Interns

  • Bérénice Jaulmes. Advisors: Mehwish Alam and Fabian Suchanek

Former members

News

An Assistant / Associate Professor position is open in the team!

Tuesday, April 29, 2025, 11:45, 4A301

Simon Razniewski (TU Dresden)

GPTKB: Comprehensively Materializing Factual LLM Knowledge

LLMs have majorly advanced NLP and AI, and next to their ability to perform a wide range of procedural tasks, a major success factor is their internalized factual knowledge. Since (Petroni et al., 2019), analyzing this knowledge has gained attention. However, most approaches investigate one …