From Historical Corpus Data to Agent-Based Models
CALL FOR SUBMISSIONS NOW OPEN (see further below)
Theme of the workshop
Recently the scientific study of language origins and evolution has seen three important breakthroughs. First, a growing number of corpora of historical language data have become available. Although these corpora were initially used to examine surface features only (for example, the frequency and distribution of word occurrences), advances in statistical language processing now allow for the thorough examination of aspects of grammar, for example how syntactic structure has progressively arisen in the history of Indo-European languages or how constructional choices have undergone change (e.g. Krug 2000; Bybee 2010; Sommerer 2011; Van de Velde 2010; Traugott & Trousdale, forthc.; Hilpert & Gries, ms.).
Second, agent-based models of the cognitive and cultural processes underlying the emergence and evolution of language have made a significant leap forward by using sophisticated, and therefore more realistic, representations of grammar and language processing (e.g. Van Trijp 2012; Beuls & Steels 2013), so that we can now go well beyond the lexicon-oriented experiments characteristic of the field a decade ago.
Finally, selectionist theorizing, which has given such tremendous power to evolutionary biology, is increasingly being applied to understand language evolution at the cultural level (Croft 2000; Ritt 2004; Mufwene 2008; Rosenbach 2008; Landsbergen et al. 2010; Steels 2011). Researchers are beginning to look more closely at which selectionist criteria could drive the origins of, and changes in, grammatical paradigms, and at how new language strategies could arise through exaptation, recombination or mutation of existing strategies. The selectionist criteria are primarily based on achieving sufficient expressive power, maximizing communicative success, and minimizing cognitive effort (Van Trijp 2013).
The confluence of these three trends is beginning to give us sophisticated agent-based models which are empirically grounded in real corpus data and framed in a well-established theory of cultural evolution, thus leading to comprehensive scientific models of the grammaticalization processes underlying language emergence and evolution. All this is tremendously exciting. The goal of this workshop is to alert the community of researchers in language evolution to this important development and to show concrete research achievements demonstrating the current state of the art. It will act as a forum for exchanging tools, and it will ask which open problems might be amenable to this approach, given the currently available data and the state of the art in computational linguistics tools for agent-based modeling. The workshop is intended to enable a deeper dialog between two communities (historical linguistics and computational linguistics), so that we can productively combine the very long tradition of empirical research in historical linguistics with the rigorous formalization and validation through simulation practiced in agent-based modeling.
The workshop will be based on real case studies as much as possible. For example, how can we explain the current messy state of the German article system, given that Old High German had a much clearer system (van Trijp 2013)? Is this development based on random drift or are there selectionist forces at work? How can we explain that Indo-European languages progressively developed a rich constituent structure with an increasing number of syntactic categories, a gradual incorporation of ‘floating’ words into phrases, and a loss of grammatical agreement (Van de Velde 2009)? How can we explain the emergence of quantifiers out of adjectives? How can we explain the rise of a case system (Beuls & Steels 2013)?
General research questions that are to be addressed:
- What are the processes that cause variation in populations of speakers?
- What are the processes that select variants to become dominant in a speech community?
- How do language strategies give rise to language systems?
- Which cognitive functions must the brain support in order to implement language strategies?
- What are good tools for doing empirically driven agent-based modeling?
Call for Submissions
We invite contributions (10′ talk + 5′ discussion) to one of the following three sessions in the workshop:
- Case Studies: historical data of emergence and evolution of grammatical phenomena and concrete agent-based models, or steps towards them.
- Tools: What is the state-of-the-art for historical linguistics corpora and tools extracting trends in grammatical evolution? What tools are available for building realistic agent-based models of grammaticalization?
- Cultural evolution theory: Which results from theoretical research in evolutionary biology can be exapted to advance cultural evolutionary linguistics?
Format of the submission. An extended abstract of max. 4 pages (including references) adhering to the Evolang stylesheet. Submissions should be e-mailed to email@example.com with the subject “Evolang workshop submission”.
- Deadline for submission: 1 March 2014, 23:59 CE
- Notification of acceptance: 15 March 2014
- Final submission: 1 April 2014
References
- Beuls, Katrien & Luc Steels. 2013. ‘Agent-based models of strategies for the emergence and evolution of grammatical agreement’. PLoS ONE 8(3), e58960. doi:10.1371/journal.pone.0058960.
- Bybee, Joan L. 2010. Language, Usage and Cognition. Cambridge: Cambridge University Press.
- Croft, William. 2000. Explaining language change. An evolutionary approach. Harlow: Longman.
- Hilpert, Martin & Stefan Th. Gries. Manuscript. ‘Quantitative approaches to diachronic corpus linguistics’. In: Merja Kytö & Päivi Pahta (eds.), The Cambridge handbook of English historical linguistics. Cambridge: Cambridge University Press.
- Krug, Manfred. 2000. Emerging English modals: a corpus-based study of grammaticalization. Berlin: Mouton de Gruyter.
- Landsbergen, Frank, Robert Lachlan, Carel ten Cate & Arie Verhagen. 2010. ‘A cultural evolutionary model of patterns in semantic change’. Linguistics 48: 363-390.
- Ritt, Nikolaus. 2004. Selfish Sounds. A Darwinian Approach to Language Change. Cambridge: Cambridge University Press.
- Rosenbach, Anette. 2008. ‘Language Change as Cultural Evolution: Evolutionary Approaches to Language Change’. In: Regine Eckardt, Gerhard Jäger & Tonjes Veenstra (eds.), Variation, Selection, Development. Probing the Evolutionary Model of Language Change. Berlin: Mouton de Gruyter, 23-72.
- Sommerer, L. 2011. ‘Old English se: from demonstrative to article. A usage-based study of nominal determination and category emergence’. PhD thesis, University of Vienna.
- Steels, Luc. 2011. ‘Modeling the cultural evolution of language’. Physics of Life Reviews 8: 339-356.
- Traugott, Elizabeth & Graeme Trousdale. Forthcoming. Constructionalization and constructional change. Cambridge: Cambridge University Press.
- Van de Velde, Freek. 2009. De nominale constituent. Structuur en geschiedenis. Leuven: Leuven University Press.
- Van de Velde, Freek. 2010. ‘The emergence of the determiner in the Dutch NP’. Linguistics 48: 263-299.
- van Trijp, Remi. 2012. ‘Not as awful as it seems: explaining German case through computational experiments in Fluid Construction Grammar’. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, 829-839.
- van Trijp, Remi. 2013. ‘Linguistic assessment criteria for explaining language change: a case study on syncretism in German definite articles’. Language Dynamics and Change 3(1): 105-132.