Research Fellow in Machine Cognition at UCL

UCL's Faculty of Brain Sciences is recruiting a Research Fellow in Machine Cognition to join Mario Giulianelli's research group in the Department of Linguistics, within the Division of Psychology and Language Sciences. The group studies information processing in human and artificial cognitive systems, with a particular focus on the evaluation, interpretability and cognitive modelling of language models, including questions related to artificial agency and AI safety. The group is part of the UCL ELLIS Unit and is connected to wider research communities across the UCL AI Centre and UCL Computer Science.

The role focuses on how language model agents and other artificial neural network foundation models represent beliefs and goals. The project will investigate whether such systems maintain representations of uncertainty over environment states, whether they encode preferences over possible states, and how these internal representations can be reliably extracted, evaluated and manipulated. The broader aim is to develop methods that support high-confidence claims about the goals pursued and beliefs held by artificial agents.

The postholder will carry out original research combining interpretability methods, behavioural and agent-based evaluation, probabilistic modelling, and ideas from cognitive science, neuroscience and theories of learning. Relevant methods may include probing, causal mediation analysis, activation verbalisation, behavioural evaluations of goal-directed behaviour, and computational modelling of the relationship between beliefs, goals, action selection and observed behaviour.

Applicants should have a PhD, or equivalent experience, in a relevant area such as computational linguistics, artificial intelligence, cognitive science, neuroscience, computer science, psychology or a related discipline. The role would suit someone with experience designing rigorous computational experiments, a strong understanding of transformer language models, knowledge of interpretability techniques and probabilistic or computational modelling methods, and good Python programming skills. Experience with language model evaluation, cognitive modelling, reinforcement learning, the neuroscience of goal-directed behaviour, learning theory, or large-scale GPU or cloud workflows would be beneficial.

Potential applicants should consult the full job description for further details and apply through the UCL jobs page.