Publication (Chapter)
2025-01-28
The Resource Debate in Machine Translation and Large Language Models
Paolo Caffoni contributes a reference entry on the language resource debate in LLMs to the Handbuch Soziale Praktiken und Digitale Alltagswelten.
Abstract
Beginning with recent advancements in Multilingual Machine Translation techniques, this chapter explores the concept of resources in Natural Language Processing and proposes a framework of analysis for so-called ‘low-resource languages.’ Departing from popular organicist metaphors of language endangerment and narratives of digital extinction, it attempts to re-contextualize the discrimination between high- and low-resource languages within a labor theory of linguistic machines and the valorization of linguistic chains. To comprehend today’s Large Language Models, multiple genealogies must be outlined and discussed. The chapter focuses on two of these possible genealogies and addresses how the rise of Large Language Models revives an old debate: the one between mechanistic and vitalist formulations of language.