
Sam Boeve

Ghent University


Title: From Tool to Theory: LLMs in Psycholinguistics

Abstract

Large language models (LLMs) have rapidly become part of the psycholinguistic toolkit. Trained on vast text corpora with the objective of predicting the next word, they provide researchers with powerful resources for studying language processing in new ways. LLMs can quantify word predictability, which can be used to model reading behaviour. They also provide high-dimensional contextual embeddings that facilitate the localization and decoding of brain activity, and they can even serve as pseudo-participants to estimate lexical features such as word familiarity and age of acquisition. These applications complement more traditional experimental and corpus-based methods, offering fine-grained insights that were previously out of reach. Yet with these new possibilities come important challenges. LLM-based analyses are sensitive to model choice and evaluation methods, raising questions about their reliability, about which aspects of language processing they capture well, and about where they diverge from human cognition. In this talk, I will survey the opportunities and pitfalls of applying LLMs in psycholinguistics, with special attention to their use in predictability research. I will present recent work on Dutch language models (Boeve & Bogaerts, 2025), illustrating how LLM-based predictability measures can both replicate and extend traditional psycholinguistic findings to new languages. The talk will conclude with a hands-on demonstration highlighting the accessibility of LLMs and the ease with which they can be integrated into ongoing research.
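As a flavour of the kind of hands-on demonstration mentioned above, the sketch below shows one common way to obtain word predictability (surprisal) from a causal language model. It assumes the Hugging Face transformers library; the "gpt2" checkpoint, the example sentence, and the token-level (rather than word-level) granularity are illustrative assumptions, not the speaker's actual materials.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Illustrative choices: "gpt2" stands in for any causal LM
    # (e.g. a Dutch model); the sentence is a toy example.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("The old keys opened the door", return_tensors="pt").input_ids

    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

    # Surprisal of a token = negative log-probability given its left context.
    log_probs = torch.log_softmax(logits, dim=-1)
    for pos in range(1, ids.size(1)):
        token_id = ids[0, pos].item()
        surprisal = -log_probs[0, pos - 1, token_id].item()
        print(f"{tokenizer.decode([token_id]):>10}  {surprisal:6.2f} nats")

    # Note: GPT-style tokenizers split rare words into subword pieces,
    # so word-level surprisal requires summing over a word's tokens.

Per-token surprisal values obtained this way are typically entered as predictors of reading measures such as fixation durations; the subword caveat in the final comment is one of the evaluation choices the abstract flags as a source of sensitivity.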

Bio

Sam Boeve is a PhD researcher at Ghent University's Department of Experimental Psychology, where he is a member of the BogaertsLab. His research focuses on the role of word predictability in reading, particularly in children, in atypical readers such as individuals with dyslexia, and in second-language learners. He uses computational methods, mainly large language models, to investigate how readers anticipate upcoming words. By combining these models with the study of human reading, he aims to advance our understanding of reading processes and their implications for language development and disorders.