Elias Kempf
Foundation models, such as LLMs and vision-language models, are becoming ubiquitous in industrial applications as well as in our everyday lives. While the capabilities of these models continue to grow, our understanding of their inner workings often lags behind. Uncovering how and what these models learn is crucial both for improving them further and for deploying these AI systems safely. The goal of my project is to gain actionable insights for enhancing current model architectures and AI safety. For example, I study how sequence information is represented and processed in transformers and state-space models, compared with biologically inspired architectures, to identify principles that enhance interpretability and computational efficiency.