I am a research group lead at the Max Planck Institute for Intelligent Systems in Tübingen. My research focuses on the role of society in the study of computation, taking into account the actions and reactions of individuals when analyzing and designing algorithmic systems. Prior to joining MPI, I spent two years as an SNSF postdoctoral fellow at UC Berkeley, hosted by Moritz Hardt. I obtained my PhD from ETH Zurich, where I was affiliated with the Data Analytics Laboratory and supervised by Thomas Hofmann. During my PhD I was employed at IBM Research Zurich, where I contributed to the design and implementation of system-aware learning algorithms that today form the backbone of the IBM Snap ML library.
When informing consequential decisions, predictions have the potential to change the way the broader system behaves by triggering actions and reactions of individuals. In doing so, they alter the very data distribution the predictive model was trained on -- a dynamic effect that traditional machine learning fails to account for. To better understand and address this phenomenon, we introduce performative prediction, a framework that extends supervised learning to account for such feedback [ICML'20]. We analyze the dynamics of retraining strategies in this setting and address the challenges that arise in stochastic optimization when the deployment of a model triggers performative effects in the data distribution it is being trained on [NeurIPS'20]. When performative effects are strong, we wish to model and understand them so they can be incorporated into the very design of learning systems. Towards this ambitious goal, we explore connections to microfoundations from macroeconomic theory and investigate how assumptions on individual behavior can be used to model and analyze performative effects in the context of strategic classification [ICML'21]. We study performative prediction through the lens of causal inference [NeurIPS'22], and we build on connections to online learning to design targeted experimentation and exploration strategies for collecting data and finding good models ex post [ICML'22]. Having only scratched the surface, there is much more to understand about how algorithms and society interact. I am excited to dive into these research questions by building on tools from optimization, causality, and control theory, as well as by learning from experts in economics and sociology.
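To make the core dynamic concrete, here is a minimal sketch of retraining under performative effects. The setup is entirely hypothetical (squared loss, a scalar model, a distribution whose mean shifts linearly with the deployed parameter; the shift strength eps, base mean mu0, and the linear response are illustrative assumptions, not taken from the cited papers):

```python
import numpy as np

# Minimal sketch of repeated retraining under performative effects.
# Hypothetical setup: squared loss, scalar parameter theta, and a
# distribution map D(theta) whose mean shifts linearly with the
# deployed model. These specifics are assumptions for illustration.

rng = np.random.default_rng(0)
eps = 0.3   # strength of the performative effect (assumed)
mu0 = 1.0   # base mean of the outcome distribution (assumed)

def sample(theta, n=10_000):
    # Deploying theta shifts the data: z ~ N(mu0 + eps * theta, 1)
    return rng.normal(mu0 + eps * theta, 1.0, size=n)

theta = 0.0
for t in range(20):
    z = sample(theta)   # data collected under the currently deployed model
    theta = z.mean()    # retrain: the squared-loss minimizer is the mean

# Fixed point of retraining: theta* = mu0 + eps * theta*, i.e. mu0 / (1 - eps)
print(theta, mu0 / (1 - eps))
```

The fixed point theta* = mu0 / (1 - eps) is a performatively stable point in the sense of [ICML'20]: the model is optimal for the distribution its own deployment induces, and whether retraining converges to it hinges on the strength of the distribution shift, here governed by eps.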
The extent to which predictions are performative is closely related to the economic concept of power -- the more powerful the firm making the prediction, the stronger the performative effects. As such, power plays a subtle role in learning: it offers the firm a lever to achieve low risk by exerting influence on the population, without necessarily fitting existing patterns in the data. To formally study the problem of power in prediction, we introduce the notion of performative power [NeurIPS'22a], quantifying the extent to which a firm can induce change in a population of participants. We relate performative power to the economic study of competition in digital economies and analyze its role in optimization. To estimate performative power from data, we propose an observational causal design that exploits discrete decisions in how firms display predictions, building on properties of predictions more generally [NeurIPS'22b], and we draw on connections to control theory to exploit repeated interactions between firms and individuals [CDS@NeurIPS'22]. In our most recent project, we investigate algorithmic collective action as a means to counter power imbalances in digital economies. Most digital firms rely to some extent on data provided by individuals, offering the population a lever to regain some power over the system by strategically reporting their data [forthcoming'23]. We study different learning settings and quantify the critical mass of individuals that needs to be mobilized to achieve concrete goals; this critical mass is closely related to the cost of mobilizing a collective. In the future, it would be interesting to study the effectiveness of concrete strategies and the challenges of coordination, as well as connections to labor markets and to collective action theory in political economy.
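A schematic version of the definition may help fix ideas (the notation below is simplified shorthand, not the paper's exact statement; see [NeurIPS'22a] for the formal treatment):

```latex
% Performative power, schematically. A firm chooses an action f from a
% set F; z_i denotes participant i's data under the status quo and
% z_i(f) the (random) data after action f is taken. Performative power
% is the largest average displacement the firm can induce:
\[
  \mathrm{P} \;=\; \sup_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n}
  \mathbb{E}\Big[ \mathrm{dist}\big(z_i,\, z_i(f)\big) \Big]
\]
```

A firm with no performative power cannot move participants' data at all and can only achieve low risk by fitting existing patterns; a powerful firm can instead steer the distribution itself.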
When training machine learning models in production, speed and efficiency are critical factors. Fast training times allow for short development cycles, offer fast time-to-insight, and, not least, save valuable resources. Our approach to achieving fast training is to enable the efficient use of modern hardware through novel algorithm design. In particular, we develop principled tools and methods for training machine learning models that exploit compute parallelism [NeurIPS'19][ICML'20], hierarchical memory structures [HiPC'19][NeurIPS'17], accelerator units [FGCS'17], and interconnect bandwidth in distributed systems [ICML'18]. We demonstrated [NeurIPS'18] that such an approach can reduce training time by several orders of magnitude compared to standard system-agnostic methods. The core innovations of this research have been integrated into the IBM Snap ML library and help diverse companies improve the speed, efficiency, and scalability of their machine learning workloads.
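As a toy illustration of the general principle (not the algorithms in the cited papers, which are considerably more refined), here is a minimal data-parallel training sketch: the data is partitioned across K workers, each computes a partial gradient, and the contributions are aggregated. Mapping this structure onto cores, accelerators, memory hierarchies, and the interconnect is where system-aware algorithm design comes in.

```python
import numpy as np

# Minimal sketch of one system-aware design idea: partition the data
# across K workers, compute partial gradients in parallel, aggregate.
# Purely illustrative; the cited work (e.g., Snap ML) uses far more
# refined algorithms tailored to GPUs, memory, and interconnects.

def partial_grad(Xk, yk, w):
    # Logistic-loss gradient contribution of one data partition.
    p = 1.0 / (1.0 + np.exp(-Xk @ w))
    return Xk.T @ (p - yk)

rng = np.random.default_rng(0)
X = rng.normal(size=(8000, 20))
y = (rng.random(8000) < 0.5).astype(float)
w = np.zeros(20)

K = 4
parts = np.array_split(np.arange(len(y)), K)  # one chunk per worker

for step in range(100):
    # In a real system these K calls run concurrently on separate
    # cores or accelerators; here they run sequentially for clarity.
    g = sum(partial_grad(X[idx], y[idx], w) for idx in parts) / len(y)
    w -= 0.5 * g
```

The interesting questions are exactly the ones this sketch glosses over: how often workers synchronize, how work is balanced across heterogeneous compute units, and how data movement through the memory hierarchy and over the interconnect is scheduled.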