Linguistic Epidemiology – Part 1, Units of analysis

In his insightful post ‘Is language a replicator?’ (June 1, 2009), Nicolas Claidière usefully critiques a recent review article by Mark Pagel on evolutionary approaches to language change (Nature Reviews Genetics Vol. 10, June 2009). Pagel’s paper (and Nicolas’s critique) raises a range of issues, but here I want to emphasise just one important point that Nicolas makes: Pagel, like pretty much everyone involved in the kind of work he reviews, is often vague or ambiguous as to the unit of analysis in language change. Are we talking about the historical evolution of elements of languages such as words? Or whole languages at the historical community level? Or languages as integrated systems in individuals’ minds? I recently addressed this issue in an article, ‘Transmission Biases in Linguistic Epidemiology’, in the online Journal of Language Contact (THEMA 2, 2008: 299-310; freely accessible online). The problem Nicolas identifies is laid out in section 3 of the paper, as follows (feel free to replace the term ‘variant’ with element, item, character, or equivalent, as you prefer):

The units of transmission: variants, not languages

There is no type of single event through which ‘a language’ as an entire structured system is socially transmitted. It is only through exposure to fragments of language, one chunk at a time, that we are able to build descriptions of whole language systems, either in learning languages (e.g., as children or as second-language learners) or in documenting them (e.g., as grammarians).



Causal processes in the dynamic circulation of language are at the level of utterances and linguistic items (Nettle 1999, Croft 2000), not at the level of languages. As has often been pointed out—nowhere more eloquently and forcefully than by Le Page & Tabouret-Keller (1985)—the notion of a language is essentially an ethnic, ideological, and political one. To understand the distribution of linguistic structure at a population level, we are primarily concerned with the spatio-temporal distribution of individual elements of a language system. Any notion that ‘languages’ are distributed in populations, while true in certain senses (see below), is secondary to the distribution of individual linguistic variants.



Let me clarify a few points now.
First, to speak of linguistic variants as ‘things’ is a convenient fiction. If we speak of the distribution in a population of a word or other linguistic form, we are in fact referring to the distribution of a communicative, collaborative practice of employing, and responding to, a word or linguistic form.
Second, to adopt an item-based approach does not imply that languages are unsystematic bundles of loose, freely-circulating pieces. Nevertheless, this approach does have to provide an explicit account of the mapping from item to structured system.
Third, the notion of ‘a language’ can play a direct role in processes of transmission, in two important ways: first, to the extent that speakers’ metalinguistic awareness and ethnolinguistic identity can be an enabling or constraining factor (e.g., where speakers’ identification of a linguistic variant with ‘a language’ affects the variant’s model bias); second, to the extent that individuals construct mental representations of higher-order structured systems consisting of large inventories of interconnected linguistic items, where these higher-order systems play an enabling or constraining role as structural contexts for individual linguistic variants. This second sense of ‘a language’ refers to the individually-situated psychological object otherwise known as a grammar (in the sense of Chomsky 1965).


With this as background, I argue that the main challenge for evolutionary approaches to linguistic (and other cultural) transmission and change is to solve what I call the item/system problem: if we are to take the individual ‘item’ (or variant or character or element) as the unit of analysis, as I think we must, then how is this unit related to higher-level systems like ‘languages’, which appear to show special properties above the ‘item’ level? I argue that the answer is to be found in a set of biases on social transmission (taking off from work by Boyd and Richerson), and in particular in the structural relationship between diffusible elements and their contexts, whether these contexts be cognitive, artifactual, semiotic, or a combination of these. I will discuss these biases in a series of posts to come.
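For readers who like to see such ideas in concrete form, here is a toy sketch of what it means for a transmission bias to act on a single item rather than on a whole language. It is not taken from the paper: the recursion p' = p + b·p·(1 − p) is a simple textbook-style form for biased transmission, chosen here as an assumption, and the names `step`, `trajectory`, and `bias` are hypothetical. It tracks only the frequency of one variant in a population and says nothing yet about how items relate to systems.

```python
def step(p, bias):
    """Return the next-generation frequency of one variant.

    p    : current frequency of the variant in the population (0..1)
    bias : assumed constant transmission bias for the variant (-1..1);
           0 means learners have no preference for or against it.
    """
    return p + bias * p * (1.0 - p)


def trajectory(p0, bias, generations):
    """Iterate `step` and return the list of frequencies, starting with p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(step(freqs[-1], bias))
    return freqs


if __name__ == "__main__":
    # A rare variant with a modest positive bias spreads through the population;
    # the same variant with no bias stays at its starting frequency.
    for b in (0.0, 0.2):
        print(b, [round(p, 3) for p in trajectory(0.1, b, 50)[::10]])
```

In this sketch the bias is a fixed number. On the argument made above, it would instead depend on the variant's context, for example on which other items a speaker already has, and that dependence is where the item/system problem comes in.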



Chomsky, Noam A. 1965. Aspects of the theory of syntax. Cambridge, Mass: MIT Press.

Croft, William. 2000. Explaining language change: an evolutionary approach. Harlow: Longman.

Le Page, R. B., and Andrée Tabouret-Keller. 1985. Acts of identity: creole-based approaches to language and ethnicity. Cambridge: Cambridge University Press.

Nettle, Daniel. 1999. Linguistic diversity. Oxford: Oxford University Press.


  • John Wilkins 22 August 2009 (16:10)

    The replicator notion is overly derived from genes by analogy, but it is not clear that genes or replicators are the sine qua non of evolution. Instead, a notion of Jim Griesemer’s, the reproducer, is better as a “unit” for cultural evolution. Recently, Peter Godfrey-Smith revised the reproducer notion so that reproducers are in effect defined by their being part of a Darwinian population: a reproducer is a physical entity that makes progeny. In language, this might be a phoneme, a grammatical structure, an accent, and so on, just so long as it is capable of being a parent. I recommend Godfrey-Smith’s book as a basis for this problem.

  • John Hawks 26 August 2009 (01:45)

    I think you’ll find a lot of historical context from early-20th-century genetics useful to this issue. Mendel conceived his elementen very much as particles, and if there is a popular notion about genes, it is surely along the same lines. But Mendel’s theory is a theory of contrasts. The development of genetics after Fisher was, from a certain point of view, an attempt to isolate the effects of primary contrasts (of allelomorphs) from the (potentially) continuous variation resulting from environment and epistases. The solution in the linguistic case does not have to resemble the genetic case, but the same item/system problem is there: how does a working genome system evolve, tolerant of variation in its constituent parts, when the parts themselves are the units of inheritance?