Tuesday, September 26, 2023, 11:45, 4A101

Nedeljko Radulovic

Post-hoc Explainable AI for Black Box Models on Tabular Data

Current state-of-the-art Artificial Intelligence (AI) models have proven very successful at solving a variety of tasks, such as classification, regression, Natural Language Processing (NLP), and image processing. The resources at our disposal today allow us to train very complex AI models for problems in almost any field: medicine, finance, justice, transportation, forecasting, etc. As AI models have grown in popularity and widespread use, so has the need to ensure trust in them. As complex as they are today, these AI models are impossible for humans to interpret and understand. In this thesis, we focus on a specific area of research, Explainable Artificial Intelligence (xAI), which aims to provide approaches to interpret complex AI models and explain their decisions. We present two approaches, STACI and BELLA, which address classification and regression tasks, respectively, on tabular data.

Both methods are deterministic, model-agnostic, post-hoc approaches, meaning that they can be applied to any black-box model after its creation. In this way, interpretability comes as an added value without compromising the black-box model's performance. Our methods provide accurate, simple, and general interpretations of both the whole black-box model and its individual predictions. We confirmed their high performance through extensive experiments and a user study.
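
To make the post-hoc, model-agnostic setting concrete, here is a minimal sketch of the general surrogate idea on which such methods build, written with generic scikit-learn components; it illustrates the setting only and is not the STACI or BELLA algorithm.

# Sketch of post-hoc, model-agnostic interpretation via a global surrogate:
# fit an interpretable model to the predictions of an already-trained black
# box and measure how faithfully it mimics them. Illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The black box: any trained model works; the surrogate never looks inside
# it, only at its inputs and outputs (model-agnostic, post hoc).
black_box = RandomForestClassifier(n_estimators=200, random_state=0)
black_box.fit(X_train, y_train)

# Interpretable surrogate trained on the black box's *predictions*, not on
# the true labels: it approximates the model, not the data.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how often the surrogate agrees with the black box on unseen data.
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"surrogate fidelity to black box: {fidelity:.3f}")

The shallow tree can then be read directly as an approximate explanation of the black box's decision logic, with fidelity quantifying how much of that logic it actually captures.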

Tuesday, September 19, 2023, 11:45, 4A301

Julien Lie-Panis

Models of Reputation-Based Cooperation: Bridging the Gap between Reciprocity and Signaling

Human cooperation is often understood through the lens of reciprocity. In classic models, cooperation is sustained because it is reciprocal: individuals who bear costs to help others can then expect to be helped in return. Another framework is honest signaling theory. According to this approach, cooperation can be sustained when helpers reveal information about themselves, which in turn affects receivers' behavior. Here, we aim to bridge the gap between these two approaches in order to better characterize human cooperation. We show how integrating them can help explain the variability of human cooperation, its extent, and its limits.

In chapter 1, we introduce evolutionary game theory and its application to human behavior.
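
For readers new to the framework, its standard workhorse is the replicator dynamic (summarized here for orientation; the notation is the textbook one, not necessarily the thesis's): if x_i denotes the frequency of strategy i and f_i(x) its expected payoff, then

\dot{x}_i = x_i \left( f_i(x) - \bar{f}(x) \right), \qquad \bar{f}(x) = \sum_j x_j f_j(x),

so strategies earning above-average payoffs spread through the population while below-average ones decline.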

In chapter 2, we show that cooperation with strangers can be understood as a signal of time preferences. In equilibrium, patient individuals cooperate more often, and individuals who reveal a higher preference for the future inspire more trust. We show how our model can help explain the variability of cooperation and trust.
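
As a hedged illustration of this logic, in generic notation rather than the exact model of the thesis: suppose cooperating costs c now and, by building a trustworthy reputation, yields a benefit b in every future period, discounted by a factor \delta \in (0,1). Cooperation then pays exactly when

c \le \sum_{t=1}^{\infty} \delta^{t} b = \frac{\delta b}{1 - \delta},

a condition met only by sufficiently patient individuals (large \delta); observing cooperation is therefore informative about an individual's time preferences.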

In chapter 3, we turn to the psychology of revenge. Revenge is often understood in terms of enforcing cooperation, or equivalently, deterring transgressions: vengeful individuals pay costs, which may be offset by the benefit of a vengeful reputation. Yet, revenge does not always seem designed for optimal deterrence. Our model reconciles the deterrent function of revenge with its apparent quirks, such as our propensity to overreact to minuscule transgressions, and to forgive dangerous behavior based on a lucky positive outcome.

In chapter 4, we turn to dysfunctional forms of cooperation and signaling. We posit that outrage can sometimes act as a second-order signal, demonstrating investment in another, first-order signal. We then show how outrage can lead to dishonest displays of commitment, and escalating costs.

In chapter 5, we extend the model in chapter 2 to include institutions. Institutions are often invoked as solutions to hard cooperation problems: they stabilize cooperation in contexts where reputation is insufficient. Yet, institutions are at the mercy of the very problem they are designed to solve. People must devote time and resources to create new rules and compensate institutional operatives. We show that institutions for hard cooperation problems can emerge nonetheless, as long as they rest on an easy cooperation problem. Our model shows how designing efficient institutions can allow humans to extend the scale of cooperation.

Finally, in chapter 6, we discuss the merits of mathematical modeling in the social sciences.