Tuesday, April 8, 2025, 11:45, 4A125

Pratik Karmakar

ProvSQL: Provenance and Probabilistic Querying in Uncertain Databases

Probabilistic databases provide a powerful framework for managing and querying uncertain data, enabling principled reasoning under uncertainty. ProvSQL extends PostgreSQL to support provenance tracking and probability computation in probabilistic databases, leveraging provenance circuits to efficiently compute probabilities and Shapley-based data valuations. In this talk, we introduce ProvSQL, demonstrate its capabilities, and explore a key use case: content-based image retrieval from the COCO dataset. We show how probabilistic query evaluation and data valuation techniques enhance explainability and trust in AI-driven decision-making.
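To make "probability computation from provenance" concrete, here is a minimal sketch that is not ProvSQL's API. It assumes a tuple-independent database and computes the probability of a query answer by brute-force enumeration of possible worlds. The function name `answer_probability`, the tuple identifiers, and the example formula are all hypothetical. Enumeration is exponential in the number of tuples, which is the cost that provenance circuits are meant to avoid.

```python
from itertools import product


def answer_probability(provenance, tuple_probs):
    """Probability that a Boolean provenance formula holds.

    provenance: function taking {tuple_id: bool} and returning bool.
    tuple_probs: {tuple_id: probability}, tuples assumed independent.
    Brute force over all 2^n possible worlds (illustration only).
    """
    ids = sorted(tuple_probs)
    total = 0.0
    for world in product([False, True], repeat=len(ids)):
        present = dict(zip(ids, world))
        if provenance(present):
            weight = 1.0
            for t in ids:
                weight *= tuple_probs[t] if present[t] else 1.0 - tuple_probs[t]
            total += weight
    return total


# Hypothetical answer derived either from t1 joined with t2, or from t3 alone.
probs = {"t1": 0.5, "t2": 0.5, "t3": 0.5}
p = answer_probability(lambda w: (w["t1"] and w["t2"]) or w["t3"], probs)
print(p)  # 1 - (1 - 0.25) * (1 - 0.5) = 0.625
```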

Tuesday, March 25, 2025, 11:45, 4A301

Gaël Varoquaux (INRIA)

Tabular foundation models: priors for numbers and strings

Deep learning typically does not outperform tree-based models on tabular data. Often this may be explained by the small size of such datasets. For images, sound, and text, the solution has been pretrained models, leading to foundation models that are adapted and reused for many tasks. I will discuss the challenges of bringing these ideas to tabular learning, and the progress that we have made in building priors for tables, i.e., columns of different natures, with numbers and strings.

Tuesday, March 18, 2025, 11:45, 4A301

Pierre Monnin (INRIA)

Neuro-symbolic approaches for the knowledge graph lifecycle

In the Web of Data, an increasing number of knowledge graphs (KGs) are concurrently published, edited, and accessed by human and software agents. Their wide adoption makes the tasks of their lifecycle essential: construction, refinement (e.g., matching, link prediction), mining, and usage to support applications (e.g., explainable AI, recommender systems). However, all these tasks require facing the inherent heterogeneity of KGs, e.g., in terms of granularities, vocabularies, and completeness. In addition, scalability issues arise due to their increasing size and combinatorial nature. In my talk, I will present my research on neuro-symbolic approaches for the KG lifecycle, intertwining domain knowledge from ontologies, deductive reasoning, analogical reasoning, and machine learning models. Throughout my presentation, I will show that such approaches enhance models by improving their semantic awareness, frugality, and the semantic interpretability of their latent representation space.

Tuesday, March 4, 2025, 11:45, 4A301

Ken Satoh (National Institute of Informatics, Japan)

Translating German traffic cases into logical rules

This is joint work with May Myo Zin at my center and Georg Borgess at University of Saarland. In this talk, I will report on our work on extracting normative sentences from German traffic cases and translating them into logical rules. The development of autonomous vehicles (AVs) requires a comprehensive understanding of both explicit and implicit traffic rules to ensure legal compliance and safety. While explicit traffic laws are well-defined in statutes and regulations, implicit rules derived from judicial interpretations and case law are more nuanced and challenging to extract. This research first investigates the potential of Large Language Models (LLMs), particularly GPT-4o, in automating the extraction of implicit traffic normative sentences from judicial decisions. We then investigate how to translate these normative sentences into a logical form. We explore the use of LLMs to automate the translation of traffic rules into PROLOG, a declarative programming language well suited to encoding logical rules and relationships. The proposed methodology consists of three key phases: extracting traffic rules from diverse textual sources, structuring them into Logical English (LE) for clarity and consistency, and translating them into PROLOG representations using advanced natural language processing (NLP) techniques, including in-context learning and fine-tuning. The experimental results demonstrate the effectiveness of LLMs in automating this process, achieving high accuracy in translation.

Tuesday, February 4, 2025, 11:45, 4A125

Fabian Suchanek

YAGO

In this talk I will present the newest version of YAGO, the knowledge base that we are building with several members of the DIG team. I will show why we build it, how we build it, and how it can be used. This will also be an occasion for me to get your feedback on our work.

Tuesday, January 21, 2025, 11:45, 4A301

Simon Delarue

Learning on graphs: from algorithms to socio-technical analyses on AI

This thesis addresses the dual challenge of advancing Artificial Intelligence (AI) methods while critically assessing their societal impact. With AI technologies now embedded in high-stakes decision sectors like healthcare and justice, their growing influence demands thorough examination, reflected in emerging international regulations such as the AI Act in Europe. To address these challenges, this work leverages attributed-graph-based methods and advocates for a shift from performance-focused AI models to approaches that also prioritise scalability, simplicity, and explainability.

The first part of this thesis develops a toolkit of attributed graph-based methods and algorithms aimed at enhancing AI learning techniques. It includes a software contribution that leverages the sparsity of complex networks to reduce computational costs. Additionally, it introduces non-neural graph models for node classification and link prediction tasks, showing how these methods can outperform advanced neural networks while being more computationally efficient. Lastly, it presents a novel pattern mining algorithm that generates concise, human-readable summaries of large networks. Together, these contributions highlight the potential of these approaches to provide efficient and interpretable solutions to AI’s technical challenges.

The second part adopts an interdisciplinary approach to study AI as a socio-technical system. By framing AI as an ecosystem influenced by various stakeholders and societal concerns, it uses graph-based models to analyse interactions and tensions related to explainability, ethics, and environmental impact. A user study explores the influence of graph-based explanations on user perceptions of AI recommendations, while the building and analysis of a corpus of AI ethics charters and manifestos quantifies the roles of key actors in AI governance. A final study reveals that environmental concerns in AI are primarily framed technically, highlighting the need for a broader approach to the ecological implications of digitalisation.

Tuesday, December 10, 2024, 11:45, 4A125

Lanfang Kong

Explainable algorithms for anomaly detection and time series forecasting

Artificial intelligence has shown dominant performance across diverse domains, including critical ones such as medicine, finance, and justice. As a result, the explainability of black-box models is becoming more and more important. We focus on two specific applications, anomaly detection and time series forecasting, and present XTREK and ADAPATCH, respectively.

XTREK is an unsupervised tree-based approach for explainable anomaly detection, which maximizes Kendall’s tau between the anomaly scores of the source anomaly detector and those of XTREK. The tree produced by our algorithm is relatively small in size, thereby boasting the renowned off-the-shelf transparency and explainability of tree-based approaches. Moreover, its explanations are sample-based. In particular, the anomaly scores are computed to be the inverse of the size of the corresponding leaf, thereby providing meaningful explanations when comparing examples with different anomaly scores. XTREK can also be used as an in-model approach, which is capable of providing concise explanations for its own decisions. Moreover, we propose efficient computation of Kendall’s tau coefficients when determining the best split at each node of the regression tree. We show how this can be computed incrementally, thereby making the running time of our algorithm almost linear (up to a logarithmic factor) in the size of the input.
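The two quantities named in this paragraph can be illustrated with a short sketch. It is not the XTREK implementation: it uses a naive quadratic Kendall's tau (the tau-a variant, which counts tied pairs in the denominator) rather than the incremental computation mentioned above, and the leaf assignments and source scores are made-up data. The helper names `kendall_tau_a` and `leaf_scores` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kendall_tau_a(x, y):
    """Naive O(n^2) Kendall's tau-a: (concordant - discordant) / all pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def leaf_scores(leaf_of_sample):
    """Anomaly score of a sample = 1 / size of the leaf it falls into."""
    sizes = Counter(leaf_of_sample)
    return [1.0 / sizes[leaf] for leaf in leaf_of_sample]


# Made-up scores from a source detector and made-up leaf assignments.
source_scores = [0.9, 0.2, 0.1, 0.3, 0.8]
leaves = ["A", "B", "B", "B", "C"]
tree_scores = leaf_scores(leaves)  # samples alone in a leaf get score 1.0
tau = kendall_tau_a(source_scores, tree_scores)
print(tau)  # 6 concordant pairs, 0 discordant, 10 pairs in total
```

Samples that share a leaf receive identical scores, so tied pairs are unavoidable; a tie-corrected variant such as tau-b may be preferable in practice.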

ADAPATCH is an adaptive patch-based saliency map method for explainable time series forecasting, which provides local, post-hoc visualization explanations. The approach highlights those patches which would result in worse predictions when hidden from the black-box algorithm. With a differential encoding module in the mask of input, the optimization can be done by gradient-based perturbation. Unlike other patch-based approaches, ADAPATCH does not need patch parameters such as the length or the stride upfront. Instead, it learns those parameters from the data, thereby effectively adapting to different settings and application scenarios. By enforcing an upper bound on the number of patches, we make sure that the patch-level explanations provided by our algorithm can be easily interpreted by humans, as opposed to explanations consisting of a large number of single time points. Moreover, ADAPATCH requires a much smaller number of parameters, typically linear in the number of patches as opposed to linear in the number of time steps. This makes our approach more efficient and easier to train.
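The idea of scoring a patch by how much the prediction degrades when that patch is hidden can be sketched as follows. This is a naive occlusion baseline with a fixed patch length, not ADAPATCH itself: it uses no gradients and learns no patch parameters. The function `patch_importance`, the toy forecaster, and the data are assumptions made for illustration.

```python
def patch_importance(series, forecaster, target, patch_len):
    """Increase in absolute forecast error when each patch is hidden.

    A patch is hidden by replacing its values with the series mean.
    forecaster: black-box function mapping a list of floats to one float.
    """
    def error(values):
        return abs(forecaster(values) - target)

    base_error = error(series)
    mean = sum(series) / len(series)
    scores = []
    for start in range(0, len(series), patch_len):
        masked = list(series)
        for i in range(start, min(start + patch_len, len(series))):
            masked[i] = mean
        scores.append(error(masked) - base_error)
    return scores


def toy_forecaster(values):
    """Hypothetical black box: mean of the last three observations."""
    return sum(values[-3:]) / 3


series = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 4.0, 4.0, 4.0]
scores = patch_importance(series, toy_forecaster, target=4.0, patch_len=3)
print(scores)  # only hiding the last patch changes the forecast
```

With a fixed `patch_len`, the explanation quality depends on a parameter chosen in advance, which is the limitation the adaptive approach is designed to remove.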

Both methods are model-agnostic, which means the architecture of the black-box model can be hidden from the users. They provide accurate and simple explanations, as validated by extensive experiments.

Tuesday, December 3, 2024, 11:45, 4A125

Gabriel Damay

Dynamic Decision Trees and Community-based Graph Embeddings: towards Interpretable Machine Learning

Machine learning is the field of computer science concerned with building models and solutions from data without knowing exactly the set of instructions internal to these models and solutions. This field has achieved great results but is now under scrutiny, among other concerns, for the difficulty of understanding or auditing its models. Interpretable Machine Learning addresses these concerns by building models that are inherently interpretable. This thesis contributes to Interpretable Machine Learning in two ways.

First, we study decision trees. This is a very popular group of machine learning methods for classification problems, and it is interpretable by design. However, real-world data is often dynamic, and few algorithms can maintain a decision tree when data can be both inserted into and deleted from the training set. We propose a new algorithm called FuDyADT to solve this problem.

Second, when data are represented as graphs, a very common machine learning technique called “embedding” consists of projecting them into a vector space. However, this kind of method is usually not interpretable. We propose a new embedding algorithm called PaRFaITe based on the factorization of the Personalized PageRank matrix. This algorithm is designed to provide interpretable results.
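For readers unfamiliar with the Personalized PageRank matrix, the sketch below computes it by power iteration on a small undirected graph. It covers only the construction of the matrix; the factorization step and everything specific to PaRFaITe are not shown. The restart probability `alpha=0.15`, the iteration count, and the example graph are assumptions.

```python
def personalized_pagerank_matrix(adj, alpha=0.15, iters=100):
    """Row s is the Personalized PageRank vector with restart node s.

    adj: {node: list of neighbours}; every node needs at least one neighbour.
    Each step applies p <- alpha * e_s + (1 - alpha) * p * P, where P is the
    random-walk transition matrix of the graph.
    """
    nodes = sorted(adj)
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    matrix = []
    for s in range(n):
        p = [0.0] * n
        p[s] = 1.0
        for _ in range(iters):
            nxt = [0.0] * n
            nxt[s] = alpha  # restart mass
            for u in nodes:
                share = (1.0 - alpha) * p[index[u]] / len(adj[u])
                for v in adj[u]:
                    nxt[index[v]] += share
            p = nxt
        matrix.append(p)
    return nodes, matrix


# Hypothetical graph: two triangles {a, b, c} and {d, e, f} joined by edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
nodes, ppr = personalized_pagerank_matrix(graph)
left_mass = sum(ppr[0][:3])   # mass that a walk restarting at "a" puts on a, b, c
right_mass = sum(ppr[0][3:])  # mass on d, e, f
print(left_mass > right_mass)
```

In this example the walk restarting at `a` keeps most of its mass inside the triangle containing `a`, which gives some intuition for why embedding dimensions derived from this matrix can align with communities.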

We study both algorithms theoretically and experimentally. We show that FuDyADT is at least comparable to state-of-the-art algorithms in the usual setting, while also being able to handle unusual settings such as deletions of data. PaRFaITe on the other hand produces embedding dimensions that align with the communities of the graph, making the embedding interpretable.

Tuesday, November 12, 2024, 11:45, 4A125

Cyril Chhun

Methodology and Meta-Evaluation Benchmark for Automatic Story Generation

Storytelling is a central component of human culture. Multiple approaches have been proposed to explore computational storytelling, despite the inherent challenges posed by the tasks of generating stories and assessing their quality. In this thesis, we design a meta-evaluation methodology and benchmark for Automatic Story Generation (ASG). First, we lay the groundwork for conducting our meta-evaluation: we describe our chosen setting, provide definitions for the ASG and Automatic Story Evaluation (ASE) tasks, and propose an original set of six criteria for story evaluation. Then, we introduce HANNA, our corpus of Human ANnotated NArratives, which contains 1,056 stories annotated w.r.t. our six criteria, and show that those criteria allow for a standardized human evaluation. We use Large Language Models (LLMs) to augment HANNA with 480 new stories and 150k+ rating annotations. We observe that LLMs obtain better grades than humans, as rated by selected LLMs. After that, we perform our meta-evaluation benchmark on HANNA. We mainly observe that specific measures for ASE are needed, and that commonly used measures (e.g. BLEU) are sub-optimal. We then present our analysis of LLM performance at ASE: we find that LLMs are currently the best proxy for human evaluation of ASG and that, in our specific setting, providing detailed guidelines does not improve correlations between LLM and human ratings. Those results prompt us to study whether the performance displayed by LLMs at ASE and ASG can be explained through different factors. We perform a three-part study on LLM-generated explanations, and an analysis of the influence of pretraining data on LLM performance. Notably, we find that LLMs struggle to explain their answers with substantiated claims. Finally, we outline three main research perspectives: designing specific ASE measures, further investigating LLM performance at ASG and ASE, and assessing and mitigating the impact of LLMs on society.

References:

Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation (COLING 2022)

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation (TACL 2024)