Abstract
Here, we test neutral models against the evolution of English word frequency and vocabulary at the corpus scale, as recorded in annual word frequencies from three centuries of English language books. Against these data, we test both static and dynamic predictions of two neutral models, including the relation between corpus size and vocabulary size, frequency distributions, and turnover within those frequency distributions. Although a commonly used neutral model fails to replicate all these emergent properties at once, we find that a modified two-stage neutral model does replicate the static and dynamic properties of the corpus data. This two-stage model is meant to represent a relatively small corpus of English books, analogous to a 'canon', sampled by an exponentially increasing corpus of books among the wider population of authors. More broadly, this model, a smaller neutral model within a larger neutral model, could represent those situations where mass attention is focused on a small subset of the cultural variants.
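As a rough illustration of the kind of process the abstract calls a neutral model, the sketch below implements a basic random-copying model with innovation. It is not the authors' two-stage model, and it may differ from the specific models tested in the paper. The function names and the parameters `n_tokens`, `mu`, `generations` and `seed` are assumptions introduced here for illustration only.

```python
import random
from collections import Counter


def neutral_step(population, mu, next_id, rng):
    """One generation of a basic neutral (random-copying) model.

    Each new token is a brand-new word with probability mu; otherwise it
    copies a token drawn uniformly at random from the previous generation.
    """
    new_population = []
    for _ in range(len(population)):
        if rng.random() < mu:
            new_population.append(next_id)  # innovation: a never-seen word
            next_id += 1
        else:
            new_population.append(rng.choice(population))  # neutral copying
    return new_population, next_id


def simulate(n_tokens=1000, mu=0.01, generations=200, seed=0):
    """Run the model and return word counts for the final generation.

    All parameter values are illustrative assumptions, not values taken
    from the paper.
    """
    rng = random.Random(seed)
    population = list(range(n_tokens))  # start with all-distinct words
    next_id = n_tokens
    for _ in range(generations):
        population, next_id = neutral_step(population, mu, next_id, rng)
    return Counter(population)


counts = simulate()
ranked = counts.most_common()  # (word, frequency) pairs, most frequent first
vocabulary_size = len(counts)  # number of distinct words in the final generation
```

In a sketch like this, `ranked` gives a rank-frequency list that could be compared with a Zipf-like distribution, and tracking `vocabulary_size` while varying `n_tokens` would give a crude analogue of the corpus-size versus vocabulary-size relation mentioned in the abstract.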
| Original language | English |
|---|---|
| Article number | 1750012 |
| Number of pages | 16 |
| Journal | Advances in Complex Systems |
| Volume | 20 |
| Issue number | 06n07 |
| DOIs | |
| Publication status | Published - 1 Sept 2017 |
Funding
We thank William Brock for comments on an early draft. RAB thanks the Northwestern Institute on Complex Systems for support as a visiting scholar. DR is supported by a grant from the Hobby School of Public Affairs, University of Houston and also by an EPSRC grant to the Bristol Centre for Complexity Sciences (EP/I013717/1). AA was supported by a Royal Society Newton Fellowship at Bristol University entitled "Cultural evolution online"; PG was supported by the Leverhulme Trust grant on "Tipping Points" (F/00128/BF) awarded to Durham University.
Keywords
- Books
- Cultural evolution
- Heaps' law
- Language evolution
- N-grams
- Zipf's law