Seminar

The DIG seminar takes place on a regular basis with both invited speakers and speakers from within the DIG team. Seminars from before September 2023 can be found here. This calendar contains future (and some past) seminars.

  • Tuesday, October 8, 2024, 11:45, 4A125

    Chaitanya Manapragada

    Automated exploration of algorithm design space

  • Tuesday, September 24, 2024, 11:45, 4A125

    Ambroise Odonnat

    Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias

    Self-training is a well-known approach to semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples. For neural networks, softmax prediction probabilities are often used as a confidence measure, although they are known to be overconfident, even for wrong predictions. This phenomenon is particularly intensified in the presence of sample selection bias, i.e., when data labeling is subject to some constraints. To address this issue, we propose a novel confidence measure, called T-similarity, built upon the prediction diversity of an ensemble of linear classifiers. We provide a theoretical analysis of our approach by studying stationary points and describing the relationship between the diversity of the individual members and their performance. We empirically demonstrate the benefit of our confidence measure for three different pseudo-labeling policies on classification datasets of various data modalities.
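    As a rough illustration of the loop this abstract describes, here is a minimal self-training sketch in which pseudo-labels are accepted only when a bootstrap ensemble of linear classifiers agrees. The majority-agreement confidence, the 0.9 threshold, and the use of scikit-learn are assumptions made for illustration; the paper's T-similarity measure is defined differently.

    ```python
    # Minimal self-training sketch with an ensemble-based confidence measure.
    # This is NOT the paper's T-similarity: confidence here is plain majority
    # agreement among bootstrap-trained linear classifiers, and the 0.9
    # threshold is an arbitrary illustrative choice.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def ensemble_confidence(ensemble, X):
        """Majority vote and the fraction of members agreeing with it."""
        votes = np.stack([clf.predict(X) for clf in ensemble])   # (members, n)
        majority = np.apply_along_axis(
            lambda v: np.bincount(v).argmax(), 0, votes)         # (n,)
        return (votes == majority).mean(axis=0), majority

    def self_train(X_lab, y_lab, X_unlab, n_members=5, threshold=0.9, rounds=10):
        """Iteratively promote confidently pseudo-labeled points to the labeled set.

        Assumes integer class labels 0..K-1 and that every bootstrap sample
        contains all classes.
        """
        rng = np.random.default_rng(0)
        for _ in range(rounds):
            # Ensemble diversity comes from bootstrap resampling here.
            ensemble = []
            for _ in range(n_members):
                idx = rng.integers(0, len(X_lab), len(X_lab))
                ensemble.append(
                    LogisticRegression(max_iter=1000).fit(X_lab[idx], y_lab[idx]))
            if len(X_unlab) == 0:
                break
            conf, pseudo = ensemble_confidence(ensemble, X_unlab)
            keep = conf >= threshold
            if not keep.any():
                break  # nothing confident enough this round
            X_lab = np.vstack([X_lab, X_unlab[keep]])
            y_lab = np.concatenate([y_lab, pseudo[keep]])
            X_unlab = X_unlab[~keep]
        return ensemble
    ```

    Replacing a single softmax score with agreement across ensemble members is the design point: an overconfident individual prediction can still be filtered out when the members disagree.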

  • Tuesday, September 10, 2024, 11:45, 4A125

    Samuel Reyd & Jean-Louis Dessalles

    Title TBD

  • Tuesday, July 9, 2024, 11:45, 4A125

    Peter Fratrič

    Mining behavior from a legal simulation environment: where we are and what lies ahead

    This talk presents a methodological framework for the use of simulation-based methods to investigate questions of non-compliance in a legal context. Its aim is to generate observed or previously unobserved instances of non-compliance and use them to improve compliance and trust in a given socio-economic infrastructure. The framework consists of three components: a law formalization process resulting in a normative system implemented as an agent-based model, a profit-driven agent generating instances of non-compliance, and a norm extraction process transforming the generated behavior into a formal model. Early results from a practical implementation of this methodology are illustrated on a multinational tax avoidance case. Towards the end, we focus on open issues related to behavior clustering and data/process mining.
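    To make the three components concrete, below is a deliberately toy sketch: a hypothetical tax norm (rate, audit probability, fine multiplier), a profit-driven agent that grid-searches its most profitable declaration, and a behavior trace that a norm-extraction step could later mine. Every rule, parameter, and function name here is invented for illustration and is not from the talk.

    ```python
    # Toy sketch: a hypothetical tax norm and a profit-driven agent that
    # searches for the most profitable (possibly non-compliant) declaration.
    # All parameters below are invented for illustration.
    import random

    TAX_RATE, AUDIT_PROB, FINE_MULT = 0.30, 0.10, 3.0  # hypothetical norm

    def expected_profit(income, reported):
        """Expected profit when `reported` out of the true `income` is declared."""
        evaded_tax = TAX_RATE * (income - reported)
        expected_fine = AUDIT_PROB * FINE_MULT * evaded_tax
        return income - TAX_RATE * reported - expected_fine

    def profit_driven_agent(income, steps=101):
        """Grid-search the declaration that maximizes expected profit."""
        grid = [income * i / (steps - 1) for i in range(steps)]
        return max(grid, key=lambda r: expected_profit(income, r))

    # Generate a behavior trace for a later norm-extraction / mining step.
    random.seed(0)
    trace = []
    for _ in range(1000):
        income = random.uniform(50, 200)
        reported = profit_driven_agent(income)
        trace.append((income, reported, reported < income))  # flag non-compliance
    ```

    With these toy numbers the expected fine per unit of evaded tax (audit probability × fine multiplier = 0.3) is below 1, so the trace shows systematic under-reporting; raising that product above 1 flips the agent to full compliance, which is exactly the kind of behavioral regularity a norm-extraction step would recover.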

  • Tuesday, July 2, 2024, 12:15, 4A301

    Chadi Helwe

    PhD defense practice talk

    This thesis focuses on evaluating and improving the reasoning abilities of Smaller Language Models (SLMs) and Large Language Models (LLMs). It explores SLMs’ performance on complex tasks and their limitations with simpler ones. This thesis introduces LogiTorch, a Python library that facilitates the training of models on various reasoning tasks with minimal coding. It also presents TINA, a negated data augmentation technique that improves SLMs’ robustness to negation in textual entailment tasks. Further, this thesis explores LLMs’ capabilities through MAFALDA, a new benchmark for identifying and classifying reasoning fallacies, proposing a new annotation scheme and evaluation metric that considers subjectivity in reasoning. The findings indicate that humans outperform SLMs and LLMs in this reasoning task. Finally, we propose several research directions that merit further investigation, such as neuro-symbolic AI and improving the reasoning abilities of low-resource LLMs.
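    As a toy illustration of the negated-data-augmentation idea behind TINA, the sketch below rewrites an entailment pair into a contradiction pair by negating the hypothesis. The string-level negation patterns and the label-flipping table are simplifying assumptions, not the thesis's actual procedure, which would require linguistically informed rewriting.

    ```python
    # Toy negated data augmentation for textual entailment, loosely in the
    # spirit of TINA as described above; the patterns and label mapping are
    # simplifying assumptions, not the thesis's actual method.
    NEGATE = {" is ": " is not ", " are ": " are not ", " can ": " cannot "}

    # If the premise entails the hypothesis, it contradicts the negated
    # hypothesis, and vice versa; 'neutral' pairs are left untouched.
    FLIP = {"entailment": "contradiction", "contradiction": "entailment"}

    def augment(premise, hypothesis, label):
        """Return one extra (premise, negated hypothesis, flipped label) example."""
        if label in FLIP:
            for pattern, negated in NEGATE.items():
                if pattern in hypothesis:
                    return premise, hypothesis.replace(pattern, negated, 1), FLIP[label]
        return None  # no safe surface negation found: skip rather than guess

    print(augment("A dog runs in the park.", "An animal is outside.", "entailment"))
    # -> ('A dog runs in the park.', 'An animal is not outside.', 'contradiction')
    ```

    Training on such augmented pairs alongside the originals is what, per the abstract, targets robustness to negation.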