Event (Conference lecture)
2025-06-12
The Predictive Turn of Language. Reading the Future in the Vector Space
Paolo Caffoni presents on the long historical arc of the predictive turn in a panel chaired by Matteo Pasquinelli
Presentation abstract
Part of the panel Cartographies of the Vector Space, chaired by Matteo Pasquinelli
In 2016, Google Translate implemented a single encoder-decoder architecture capable of supporting over 100 languages through shared parameters (Johnson et al., 2017). This shift enabled what is now known as “zero-shot translation” – the model’s ability to translate between previously unseen language pairs without using pivot languages or fine-tuning. Researchers at Google observed that semantically similar sentences across different languages tended to cluster within the same region of vector space, suggesting the emergence of a shared semantic representation – an “interlingua.”
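The clustering observation can be reproduced in miniature with off-the-shelf tools. The sketch below is not Google’s 2016 system; it assumes the sentence-transformers library and the multilingual model paraphrase-multilingual-MiniLM-L12-v2 are available, and simply checks that translations of one sentence sit closer together in vector space than an unrelated sentence in the same language.

```python
# Minimal sketch of the "interlingua" clustering observation.
# Assumes the sentence-transformers library and a pretrained multilingual
# encoder; this is an illustration, not the Google Translate architecture.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

sentences = {
    "en": "The weather is nice today.",
    "de": "Das Wetter ist heute schön.",
    "it": "Oggi il tempo è bello.",
    "en_unrelated": "The stock market fell sharply.",
}

# Encode each sentence into the shared vector space.
embeddings = {lang: model.encode(text) for lang, text in sentences.items()}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Translations of the same sentence should score higher than an
# unrelated sentence written in the same language.
print("en vs de:", cosine(embeddings["en"], embeddings["de"]))
print("en vs it:", cosine(embeddings["en"], embeddings["it"]))
print("en vs en_unrelated:", cosine(embeddings["en"], embeddings["en_unrelated"]))
```

In such a setup the cross-lingual pairs typically score markedly higher than the unrelated same-language pair, which is the behaviour the Google researchers read as evidence of a shared semantic representation.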
What is the value of prediction in language? While Foucauldian frameworks have attempted to conceptualize prediction within the rationalities of governance – e.g., “zero-shot politics” (Amoore, 2024) – such interpretations often foreground institutional control while sidelining the underlying infrastructural and material transformations.

Reviewing the linguistic and philosophical work of the Cambridge Language Research Unit (1955–1970), Lydia H. Liu notes that the theoretical innovation introduced by the “machine interlingua” (Richens, 1956) lay precisely in its rejection of isolated, monolingual mental spaces that had dominated AI discourse (Liu, 2023). Although it may seem anachronistic to compare the social, economic, and intellectual contexts of 17th-century universal language projects (from Descartes to Wilkins) with mid-20th-century machine translation experiments, both eras – as Jacqueline Léon (2002) observes – placed intermediary and universal languages at the forefront of scientific agendas. Leibniz approached the task of inferring the unknown not as a problem of machine ontology – what we might today call computation, neural networks, or AI – but as a question of language epistemology. He coined the term cogitatio caeca (“blind thought”) to describe the capacity to perform calculations using words or symbols whose meanings might not be fully grasped (Leibniz, 1685). Today, Bender and Gebru might dismiss cogitatio caeca as yet another “stochastic parrot” (Bender et al., 2021). Yet it was precisely this symbolic manipulation of signs to infer the unknown that interested Leibniz.

In Shannon’s information theory – an epistemic and political counterpoint to the literary theorizations of Russian Formalism – prediction is linked to uncertainty: the more predictable a message, the less information it conveys (Shannon, 1951). His measures of entropy and redundancy sought to quantify uncertainty, estimating how much could be anticipated within a communication system.
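Shannon’s link between predictability and information can be shown with a toy calculation. The sketch below is a deliberate simplification: it estimates entropy from single-character frequencies only, whereas Shannon (1951) worked with n-gram models and human prediction experiments. It is meant only to show that a highly redundant, predictable string carries fewer bits per character than varied text.

```python
# Toy unigram (character-frequency) entropy estimate, illustrating that
# more predictable text conveys less information per character.
# A simplification of Shannon (1951), not a reconstruction of his method.
import math
from collections import Counter

def unigram_entropy(text: str) -> float:
    """Estimate entropy in bits per character from character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# A maximally redundant string yields ~0 bits per character;
# ordinary English text yields a few bits per character.
print(unigram_entropy("aaaaaaaaaaaaaaaaaaaa"))
print(unigram_entropy("the quick brown fox jumps over the lazy dog"))
```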
Numerous examples suggest that the “predictive turn” in language cannot be attributed to a single historical moment or technological rupture. Rather than viewing prediction as a byproduct of epistemic truth-seeking, as in Leibniz’s philosophical language, we might understand it as the result of countless micro-abstractions enacted within everyday linguistic practices. This reconceptualization invites a critical inquiry into prediction’s relationship with labor: How does linguistic production in vector space connect with broader socio-economic structures of work? In this light, prediction becomes a mode of command over the future, operationalized through the accumulation of abstracted linguistic traces across digital platforms.