Tatjana Scheffler

Ruhr University Bochum

Homepage

Talk: Linguistic analysis with LLMs? The case of discourse and pragmatics

Abstract

Many computational linguistic tasks, even hard ones, can be done well by current LLMs. When is it appropriate to use LLMs for corpus annotation, and which types of linguistic analyses require human intervention? We will specifically address the level of discourse and pragmatics. On the one hand, discourse/pragmatics analyses have an image of being "less strict" and more subjective than other linguistic questions. Thus, many LLM annotations and responses may seem appropriate at first glance. On the other hand, discourse phenomena are typically rare even in large corpora (because they manifest over large text chunks), and pragmatic phenomena exist at the intersection of language and society (i.e., language as a communicative system, not just a grammatical system). Pragmatic phenomena thus often depend crucially on human interaction, intentions, and other non-linguistic aspects which are not transparent to language models (the grounding problem). Due to these factors, LLM responses are still frequently un-human-like in the discourse/pragmatics domain. Using examples from recent research into the detection and interpretation of figurative language, the semantics and pragmatics of emojis, and the detection of hate and candy speech, we will discuss the discourse and pragmatic abilities of LLMs and the circumstances under which LLMs can be used to aid linguistic analyses.

Bio

Tatjana Scheffler is professor of digital forensic linguistics at Ruhr University Bochum. After completing her PhD in formal semantics and pragmatics at the University of Pennsylvania, USA, she became a researcher in multimodal human-computer interfaces at the German Research Center for Artificial Intelligence, before returning to academia at the universities of Potsdam and Konstanz, Germany. She joined Ruhr University Bochum in 2020. Her research encompasses corpus and computational linguistic approaches, applied primarily to phenomena in pragmatics and discourse in social media data. She works on topics such as metaphors, disinformation, and the semantics of emojis.