Author’s post-review version. Published in ‘Ye whom the charms of grammar please’: Studies
in English Language History in Honour of Leiv Egil Breivik, publisher: Peter Lang, editors:
Kari E. Haugland, Kevin McCafferty, Kristian A. Rusten, pp. 27–54. ISBN: 978-3-0343-1779-5.
http://www.peterlang.com/index.cfm?event=cmp.ccc.seitenstruktur.detailseiten&seitentyp=produkt&pk=81061&cid=5&concordeid=431779
Do not redistribute without permission.
In search of the S (curve) in there1
Gard B. Jenset
TORCH - The Oxford Research Centre in the Humanities, University of Oxford
Abstract: The present article considers the evolution of existential there in Old English, with a
focus on the mechanisms underlying the propagation of the change once the initial innovation
had taken place. Drawing on the modeling approach taken by Blythe and Croft (2012), it is
argued that social and structural factors colluded in the early stage propagation of existential
there. Alongside grammatical factors, it is argued that author prestige constituted a social
factor in the propagation of existential there. While this factor on its own is incapable of
explaining the emergence of existential there, it is shown that texts with one identified author
from the late Old English use the lexeme at higher rates. From a methodological perspective,
the article illustrates the value of advanced quantitative statistical modeling as a means to
integrate corpus data and historical linguistic theory.
1 The author would like to thank Jóhanna Barðdal, Tonya Kim Dewey, Barbara McGillivray,
and two anonymous reviewers for their helpful comments and suggestions; nevertheless,
responsibility for the final article rests with the author alone.
1 INTRODUCTION
The Present-day English distinction between existential and locative there, exemplified in (1)
and (2) with data from the spoken part of the Contemporary Corpus of American English or
COCA (Davies 2008), goes back to Old English (Breivik 1990). It is widely accepted that the
existential use of there diachronically originated from the locative adverb there, plausibly
through the interplay of syntactic, semantic, and pragmatic factors (Breivik 1990; 1997, Jenset
2010; 2014).
(1) there was a brand-new, just-made key left at the crime scene in the lock. (Existential)
(2) she was there, stuck in the house in New York. (Locative)
However, an important and as yet unsettled question is how this innovation, once it had
occurred, propagated through the speech community. The essential question explored by the
present paper is: how did Old English þær (for convenience discussed in its modern form
there), once the initial reanalysis as a grammatical subject had taken place, survive and affirm
this new status?
1.1 The evolution of existential there in Old English
In Old English, a number of different constructions were available for expressing existential
or presentational propositions. Breivik (1990) defines the three primary constructions as
follows: type D with there, type C without there but requiring one in later stages of English
(i.e. a zero pronoun), and type A/B which consist of miscellaneous affirmative main clause
constructions. The types are exemplified in (3)–(5) below using examples from Breivik (1990:
198–199).
(3) Micel yfelnyss wæs on iudeiscum mannum. (type A/B)
Much evil was on Jewish men;
‘There was much evil in Jewish men.’ (ÆChom I XI:317–19)
(4) Wæs bi éastan þære ceastre welneah sumo cirice… (type C)
Was by east that town quite-near some church…;
‘There was close to the east close to the town a church…’ (Bede 62:2–3)
(5) and ðær wæs án cyrce of scíndendum golde. (type D)
and there was one church of shining gold;
‘and there was a church of shining gold’ (ÆChom 1 XIII:196–97)
As established by Breivik (1990), the type C construction was gradually overtaken in
frequency by the type D construction with there over the course of the Old and Middle
English periods. Following work in Construction Grammar (Goldberg 1995, Croft 2001), the
constructions in (3)–(5) can be considered form-meaning pairs in their own right, and the
reorganization, including changes in relative frequency, in the speech community can be seen
as constructional change (Hilpert 2013: 16). This theoretical positioning is not mere window
dressing, but serves to highlight an important point: rather than casting the emergence of
existential there as a question of the insertion of a lexeme into a grammatical slot, the
implication is that the entire construction should be investigated as a unit in its own right, with
distinct patterns of change.
1.2 Mechanisms of change
The present study follows Croft (2000: 185) in defining change as modified replication (i.e.
innovation), followed by differential replication of the modified variant, essentially defining
change as ‘innovation + replication’. It is probably fair to say that more has been written on
the re-analysis, or innovation, than on the replication (or propagation), as far as existential
there is concerned. Syntactic, semantic, and pragmatic factors are all plausible causes for the
initial innovation; however, that is not to say that they are equally capable of explaining the
propagation of the innovation.
The question of which mechanisms operate to propagate linguistic change, and how
they operate, is still far from settled (Blythe & Croft 2012:269). Blythe and Croft distinguish
two broad categories, namely social factors and the frequency of interaction between
interlocutors (2012: 269–270). So far, explanations for the evolution of existential there have
tended to fall in the frequency category. Breivik (1990) suggests that the propagation of the
change, once the initial reanalysis had been achieved, passed through an essentially
deterministic typological process whereby the combined pressure from pragmatic context and
syntactic changes would push the frequency of use from one typological category to another
(1990: 247–249). Other studies dealing with related problems have tended to take a similar
approach. Williams (2000) explains the sudden disappearance of empty expletive subjects in
Middle English by the disappearance of verb initial word order; an explanation which is
analogous to the weight given to the shift from verb second to verb third order in Breivik
(1990).
Later studies on the topic have tended to build on the same line of argumentation.
Breivik (1997) adds a valuable cognitive linguistic dimension to the question of reanalysis,
and Breivik & Swan (2000) extend the argument in the spirit of frequency of interaction by
proposing that the grammaticalization of there proceeded through ‘repeated use in local
syntactic contexts’ (2000:27). Jenset (2010) builds on the same line of reasoning by arguing
that differentiated frequencies of use involving both the linguistic context and there would
gradually shift the cognitive interpretation in favor of existential there via perceptual bias.
The argumentation nevertheless rests on a view which can be considered deterministic in that
there is no room for social factors like identity or prestige (Baxter et al. 2009:258).
However, any deterministic model of the development of existential there must face
the questions posed in Baxter et al. (2009) and Blythe & Croft (2012), namely: does it fit the
data? It is well known that many linguistic phenomena follow an S-curve in their diachronic
development, leading some to consider it a default pattern for language change (Chambers
2002:361). See also the discussion in Blythe & Croft (2012: 278–281). The S-curve is attested
in the development of existential there by Breivik (1990: 226), a result corroborated by
findings in Jenset (2010: 273) with a partially different and larger dataset.
Blythe and Croft (2012) present a mathematical model of linguistic change where they
simulate the different assumptions made by various attempts to explain the propagation of
change, i.e. the S-curve. Their results are clear: only an explanatory model that involves a
differential social weighting of competing variants will result in an S-curve under any
reasonable assumptions. A key point in their argumentation is that the variants themselves are
associated with social valuations. According to Blythe and Croft’s model, the S-curve is in
other words not the result of social accommodation and interaction as argued by others
(Nevalainen & Raumolin-Brunberg 2003: 53–55).
The question of the diachronic evolution of existential there can then be broken down
into two sub-questions, corresponding to the notions of innovation or actuation, and
propagation or diffusion (Croft 2000:4–5):
1. What led to the reanalysis of locative there as an existential subject?
2. What led to the propagation of this reanalysis?
Explaining the S-curve properly falls under the domain of question 2, which will serve as the
focus for the present study. However, the study will refrain from going beyond Old English,
both for reasons of space but also because the early phase of propagation is crucial under
Blythe & Croft’s model (2012: 292).
1.3 Data and organization
The present study draws its data from the York-Toronto-Helsinki Parsed Corpus of Old
English Prose or YCOE (Taylor et al. 2003), and more specifically from a dataset that was
collected and described in more detail in Jenset (2010). The dataset has been enriched with
more information for the purposes of the present study, as explained below in section 4. Since
the YCOE annotation does not distinguish between existential and locative uses of there, the
distinction between them has been made based on a machine learning algorithm. Jenset (2010:
254–260) describes the procedure that was used for that study, but for the present study a
new, improved model was employed (Jenset To appear), yielding results more
similar to those reported by Breivik (1990).
The paper is organized as follows: section 2 provides more details on the evolutionary
model that underlies Blythe and Croft’s mathematical model. Section 3 presents an
exploratory step discussing potential candidates for social factors that could plausibly be
involved in the evolution of existential there. Section 4 discusses a statistical model of the
data and illustrates how social and linguistic factors can be accommodated side by side as
explanatory factors. Section 5 pulls the threads from sections 3 and 4 together, while section 6
presents the summary conclusions.
2 REPLICATION MECHANISMS
Blythe & Croft (2012) build on Croft’s evolutionary framework for language change (2000)
in their assumptions. They see language as a complex, adaptive system in which speakers
interact and where the speakers’ past behavior constrains current and future behavior as a
consequence of competing factors (Blythe & Croft 2009:47–48). The competing factors can
be considered selectors, or competing differential causes, acting upon replicators (linguistic
structures, i.e. that which is being replicated or passed on). The speaker is modeled in terms of
an interactor, i.e. a vehicle through which the environment can cause differentiated replication
(Croft 2000: 20–40).
Building upon this framework, Blythe & Croft (2012) propose a typology of
mechanisms for propagation of language change in a way that is suitable for mathematical
modeling. Their four mechanisms are as follows:
1. Neutral evolution: linguistic change without any social mechanisms. As an example,
Blythe and Croft give the fixation on the most frequent variant in a situation with fluctuation
between different variants (2012: 275–276).
2. Neutral interactor selection: change driven by the frequency of interaction among
interlocutors, i.e. the strength of the ties between them. Social network structures, without any
social differentiation assigned to the linguistic variants themselves, are given as an example
under which this change could occur (2012: 273–274).
3. Weighted interactor selection: change driven by the asymmetry in social prestige between
innovators and followers. The crucial property of this mechanism is the identity of the group
using a given variant, not the variant itself (2012: 274–275).
4. Replicator selection: change driven by the prestige directly associated with linguistic forms
themselves. In other words, the linguistic forms are not considered prestigious because they
are associated with one particular group; instead the forms (or replicators) take on the prestige
directly, possibly through socio-economic differences and social mobility (2012: 272–273).
An important property of these four mechanism-specifications is that they are
accompanied by very detailed specifications, glossed over in the presentation above, that are
implemented in mathematical models simulating the behavior of speakers under the various
conditions. Crucially, Blythe and Croft find that only under the assumptions of replicator
selection can a convincing S-curve easily be arrived at (2012: 291–292). Their argument is
not that mechanisms 1–3 cannot be attested or are not relevant; merely that mechanisms 1–3
do not, under their simulations, result in a propagation pattern resembling an S-curve. Since
even a purely frequency-driven, neutral change must still operate among speakers
communicating with each other, the model implication of this is that each speaker is given the
same weight or influence on the other during interaction. Again, the point is not that such
scenarios do not take place, but that obtaining an S-curved pattern of change proves difficult
under any reasonable assumptions.
Blythe and Croft note that the change rate of an innovation will depend on how that
innovation is distributed in the community (2012: 297). If the propagation depends only on the
identity of the innovator, the propagation tends to follow a different path than in replicator
selection, i.e. when the variant itself is associated with a differential social weighting (2012:
297–298).
In short, Blythe and Croft specify the assumptions under which an S-curve can be
achieved (or expected) in painstaking detail, and reach the conclusion that some sort of social
weighting of the variant itself is required for the innovation to not only spread initially but
also to complete the full S-curve trajectory. Taking Blythe and Croft’s approach as a start, the
following section will explore differential distributions of existential there in YCOE based on
social variables.
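The contrast between neutral change and replicator selection can be illustrated with a deliberately simplified sketch. The code below is not Blythe and Croft's mathematical model; it is a toy deterministic update rule, assumed here purely for illustration, in which the innovative variant is replicated with a constant advantage `bias` (a hypothetical parameter standing in for a social weighting of the variant itself). With a positive bias the share of the variant follows an S-shaped path; with zero bias the expected share does not move.

```python
def propagate(share, bias, generations):
    """Return the share of the innovative variant over time.

    Toy update rule (an assumption, not taken from Blythe & Croft 2012):
    each generation the share grows by bias * share * (1 - share).
    """
    trajectory = [share]
    for _ in range(generations):
        share = share + bias * share * (1.0 - share)
        trajectory.append(share)
    return trajectory


# Hypothetical settings: the variant starts at 1% of usage.
weighted = propagate(share=0.01, bias=0.2, generations=60)
neutral = propagate(share=0.01, bias=0.0, generations=60)

# Per-generation gains: small at first, largest mid-way, small again at the end.
gains = [later - earlier for earlier, later in zip(weighted, weighted[1:])]
peak = gains.index(max(gains))

print(f"weighted: start={weighted[0]:.2f}, end={weighted[-1]:.2f}, fastest growth at step {peak}")
print(f"neutral:  start={neutral[0]:.2f}, end={neutral[-1]:.2f}")
```

The slow start, rapid middle phase, and slow completion of the weighted trajectory are what is meant by an S-curve; the sketch says nothing about which social mechanism supplies the bias.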
3 IN SEARCH OF DIFFERENCES
Given the argumentation in Blythe & Croft (2012), it is reasonable to look for social variables
that might explain the observed S-curve found in the evolution of existential there. However,
the possibility that frequency of interaction is involved should not be dismissed out of hand.
Since frequency of interaction is very hard to estimate from a written historical corpus, corpus
size was used as a proxy. If the driving mechanism behind frequency of interaction is a higher
degree of interaction with some speakers and a lower degree of interaction with others (Blythe
& Croft 2012: 270–271) then the total corpus size might serve as a possible indicator variable.
Although some degree of chance is involved, there were doubtlessly ‘social preferences
attitudes’ (Labov 2001: 191) at work with respect to who wrote what during the Middle Ages.
The argument can also be extended to which texts were preserved.
With this in mind, the proportion of existential uses of there (out of all instances of
there) was calculated for the Old English, Middle English, and Early Modern English data.
The corpus was measured in the number of sentences rather than words, since this
corresponds directly to how instances of there were collected (Jenset 2010). However, a
Spearman rank correlation test, a robust non-parametric test of correlation, showed that there
was no correlation between corpus size and the proportion of existential uses of there in the
corpus (p = 1, not significant). Furthermore, it is worth noting that mechanism 1 (neutral
evolution by frequency of use) is less plausible when it comes to existential there, since the
data attest to a move away from the most frequent variant (type C) to a minority variant (type
D, i.e. the construction with there). Combined, these considerations strengthen the case for the
Blythe-Croft model with respect to the S-curve observed in the evolution of existential there.
The question is: which social variable (or variables) was involved? In answering this question
the focus will be on the Old English period.
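The rank correlation test mentioned above can be sketched as follows. The figures are invented placeholders, not the values behind the reported result, and the sketch assumes one observation per period (Old English, Middle English, Early Modern English), which the text suggests but does not state. The helper names are illustrative, and the p-value is obtained by enumerating all rank permutations rather than by any particular statistical package's routine.

```python
from itertools import permutations


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_exact(x, y):
    """Spearman's rho and an exact two-sided permutation p-value."""
    rank_x, rank_y = ranks(x), ranks(y)
    rho = pearson(rank_x, rank_y)
    shuffles = list(permutations(rank_y))
    as_extreme = sum(
        1 for s in shuffles if abs(pearson(rank_x, s)) >= abs(rho) - 1e-12
    )
    return rho, as_extreme / len(shuffles)


# Hypothetical placeholder data: corpus size in sentences and the
# proportion of existential uses of 'there' for three periods.
corpus_sentences = [100_000, 300_000, 200_000]
existential_share = [0.10, 0.20, 0.30]

rho, p_value = spearman_exact(corpus_sentences, existential_share)
print(f"rho = {rho:.2f}, exact two-sided p = {p_value:.2f}")
# rho = 0.50, exact two-sided p = 1.00
```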
The natural categories to consider are of course social class, sex and gender, social
context, and ethnicity (Trudgill 2002: 373). However, in historical corpus linguistics the
parameters of the social context, as much as the linguistic variables themselves, are strictly
determined by who wrote and what was preserved (Schneider 2002: 89–90). Since the
material in question originated within a narrowly circumscribed literary class that was
essentially the purview of men, the concern here will be with social context and ethnicity. The
YCOE documentation provides information which can serve as indicators of social context
(viz. genre, and the text itself which at least to some degree reflects one or more authors) as
well as ethnicity (viz. dialect).
3.1 Exploring corpus data through dispersion
Having identified some candidate variables for social differentiation of variants, the next
question is how to choose between them. The problem can be reformulated as follows: which
social variable stands sufficiently out from the others with respect to existential uses of there
to make it a plausible explanatory factor for the social differentiation assumed to underlie the
observed S-curve?
In corpus linguistics variation is typically reported in terms of frequency of co-
occurrence or a derived measure of association such as Z-score or mutual information (MI),
see Schmitt (2010: 124–132) for an overview. However, Gries (2008) points out that co-
occurrence in itself can be extremely problematic when the dispersion of the item(s) in
question is not properly taken into account. If a term t occurs 100 times in a corpus with 100
parts we would expect to find more or less 1 instance in each corpus part if the term is a
general one. If all the 100 instances of t were found in only one of the 100 corpus parts, it
might suggest that it has a specialized meaning or that it has taken on certain special
connotations. This argumentation can be applied directly to the question of existential uses of
there and social context, where the corpus parts are the various categories within each
indicator of social context and ethnicity.
Ethnicity and social context are not recorded directly in the corpus data, but the corpus
meta-data provides some relevant information. Ethnicity is no clear-cut, unproblematic
category which stands in a one to one relationship with language (Fought 2002). However, it
is probably fair to say that both language and ethnicity express identity in ways that
sometimes overlap, and that language can express national but also regional ethnicity
(Chambers 2002: 362–364). The corpus meta-data provides information about the dialect of
Old English used, which, together with information about the genre and the text itself, are
used as categories over which the dispersion of existential there is gauged.
Several measures of dispersion exist (Gries 2008: 406–410). The present study uses
the measure introduced in Gries (2008), DEVIATION OF PROPORTIONS or DP, since it is
intuitive, robust and conceptually simple, and has a degree of sensitivity that many other
measures lack (Gries 2008: 417–419). The fundamental idea is to compare the size of each
corpus part (relative to the overall corpus) to the proportion of term t in that part, and to sum
the differences to obtain a single value. In more technical terms, we can define DP as the sum
of the absolute differences between observed and expected proportions divided by 2, see also
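the formula below, where $n$ is the number of corpus parts:

$$DP = \frac{1}{2} \sum_{i=1}^{n} \left| OP_i - EP_i \right|$$
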
The expected proportion (EP) is the size of each corpus part relative to the size of the corpus,
whereas the observed proportion (OP) is the number of instances of term t in the corpus part
(denoted by subscript i), relative to all instances of t in the corpus as a whole. A low DP value
indicates that the term is spread fairly evenly across the corpus parts in question, whereas a
higher DP value indicates that the term is spread unevenly, that is, it tends to lump together in
some parts of the corpus. To interpret the magnitude of DP in more detail, Gries (2008: 420–
421) provides DP results applied to data sampled from the BNC. The results range from 0.08
for the indefinite article a (obviously very evenly dispersed) to 0.99 for rare and specialized
lexemes like macari and mamluks. Of greater practical importance is perhaps the
establishment of cutoff points in between these values near the theoretical endpoints of 0 and
1. In his data, Gries classifies intermediate DP values as those between the first (0.27) and
third (0.93) quartiles, with observed intermediate values ranging from 0.41 to 0.80. It should
be clear that this taxonomy is a heuristic, but it is nevertheless valuable as a baseline for
comparison when judging the magnitude of DP values.
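The definition of DP given above (half the sum of the absolute differences between observed and expected proportions) can be sketched in a few lines. The function name and arguments are illustrative and not taken from Gries (2008) or from the scripts used for this study. The two calls reproduce the thought experiment from section 3.1: a term t occurring 100 times in a corpus with 100 equally sized parts, first spread evenly and then concentrated in a single part.

```python
def dp_dispersion(part_sizes, term_counts):
    """Return DP for one term.

    part_sizes:  size of each corpus part (e.g. in sentences)
    term_counts: occurrences of the term in each corpus part
    """
    corpus_size = sum(part_sizes)
    term_total = sum(term_counts)
    expected = [size / corpus_size for size in part_sizes]   # EP_i
    observed = [count / term_total for count in term_counts]  # OP_i
    return 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))


# 100 equally sized corpus parts (the part size of 1000 is arbitrary).
parts = [1000] * 100

evenly_spread = dp_dispersion(parts, [1] * 100)        # one instance per part
concentrated = dp_dispersion(parts, [100] + [0] * 99)  # all instances in one part

print(f"evenly spread: DP = {evenly_spread:.2f}")  # evenly spread: DP = 0.00
print(f"concentrated:  DP = {concentrated:.2f}")   # concentrated:  DP = 0.99
```

The evenly spread term sits at the lower theoretical endpoint, while the concentrated term approaches the upper one, matching the interpretation of low and high DP values given above.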
3.2 The dispersion of existential there in Old English
Using the formula above, a DP value for existential uses of there in the YCOE material was
calculated based on the following corpus parts (DP value in parentheses): dialect (0.12),
religious vs. secular texts (0.19), genre (0.25), text (0.40). The plot in figure 1 shows these
values plotted against a horizontal line representing a cutoff between the first and second
quartiles in the data presented in Gries (2008: 420). Like the subsequent analyses in this
paper, the calculations and the creation of the plot were done with the statistical package R (R
Development Core Team 2011). For exploratory purposes we can loosely define the cutoff
between the first and second quartiles as the dividing line between a small and a medium
deviance from the expected proportions. The DP value for individual texts clearly stands out
from the other three categories being the only one to fall into the intermediate range suggested
by Gries (2008: 420–421). Since a higher DP value represents a more uneven dispersion, the
interpretation is as follows: existential uses of there are fairly evenly distributed over all the
parts of the corpus when we divide it up based on dialect, genre, or whether or not we have a
religious text. However, if we divide the corpus by individual texts, we find that existential
uses of there are distributed less evenly.
Figure 1: Barplot showing the difference in magnitude of DP values for existential there.
Each bar represents one of the four social context variables that could be derived from the
corpus meta-data. A higher value indicates that occurrences of existential there are unevenly
distributed among the corpus parts represented by the category. The dotted horizontal line
represents a heuristic cutoff point in the form of the value of the first quartile in the data
discussed by Gries (2008).
The plot in figure 1 suggests that existential there is found disproportionally in some texts
compared to others, in other words: some authors would use the lexeme more often than
others. Common function words and light verbs fall in the lowest range of DP values in Gries’
data (2008: 421), and this is where we would intuitively expect existential uses of there to fall
too. Since the dispersion of existential uses of there by text falls in the intermediate range of
DP values, we can hypothesize that something induced different usage rates of the lexeme in
different authors. I propose that this ‘something’ was a difference in social valuation, cf. the
argument in Blythe & Croft (2012).
The DP metric should primarily be considered an exploratory tool. It gives an
indication of uneven distributions, but it does not inform us directly as to why that might be
the case. Furthermore, as the discussion above indicates, the DP measure of dispersion is not a
formal test statistic associated with a null-hypothesis test that can be expressed as a p-value. A
formal null-hypothesis test of significance requires a formalized hypothesis to be meaningful
and interpretable, and this makes it clear precisely why the DP measure is very useful in data
exploration. Initially, i.e. without further information, our best guess would be that a lexeme is
evenly distributed. However, as a formalized hypothesis test this assumption is highly
underspecified. It is reasonable to suspect that a lexeme might be unevenly distributed in
some way, but without specific assumptions to test (either based on theory or previous
research) it is simply premature to employ a formal null-hypothesis test. Instead, the DP
measure quantifies the degree to which the observed data differs from the simplistic a priori
assumption. Based on the patterns highlighted by the DP measure, it is then possible to
construct a more realistic hypothesis which is more readily testable with formalized tests and
models.
It would strengthen the assumption made above if the dispersion of existential uses of
there could be compared to a competing variant with a lower DP value. The dominant type of
existential construction in Old English was type C, i.e. the one without there (Breivik 1990:
227). The construction is characterized by a verb-initial linear order, although the verb might
be preceded by a negative particle or an adverbial of time or place (Breivik 1990:193). As the
examples in Breivik (1990: 199–200) attest to, the construction is sufficiently heterogeneous
to make automated searches in YCOE difficult, since the corpus is not annotated for
constructions. However, one sub-type is reasonably easy to identify, namely cases with an
initial negative particle with a cliticized form of be followed by a nominative NP, which is a
sub-type of existential constructions without there, cf. also the definition of Breivik’s type C
pattern (Breivik 1990: 192–194) exemplified by (4) in section 1 above. An example of this
construction from Wulfstan’s Homilies, found in YCOE, is shown in (6) below:
(6) Nis nan swá yfel scaþa swá is déofol silf;
not-is none so evil enemy as is devil self;
‘There is no enemy so evil as the devil himself.’ (cowulf,WHom_16b:29.1366)
Searching YCOE for this pattern using CorpusSearch 2.0 yielded 633 sentences. Employing
the same formula for DP as above, the resulting DP value for cliticized type C constructions
by text was 0.29. In other words, the cliticized type C existential construction is more evenly
distributed among the texts in the corpus than the type D construction with existential there.
This result strengthens the assumptions made above, since it demonstrates that the cliticized
type C variant, which in diachronic terms was on its way out (Breivik 1990: 226–227), was
more widely used across the texts in YCOE than the new variant with there. Incidentally, the result
also demonstrates that the differences between categories in figure 1 are not an artifact of the
choice of text as the category used to partition the corpus. Although this difference does not in
itself establish that there was a difference in social valuation of the variants, it strongly
suggests that the difference in Old English between existential constructions with and without
there is not merely a question of magnitude as established by Breivik (1990: 224). The two
competing constructions also differ in their distribution among authors, i.e. who used it. The
next section deals with the question of how this difference can be linked to the social
valuation or weighting of competing variants.
4 AUTHORS, NORMS, AND CONSTRUCTIONS
The explorative steps outlined in the previous section indicated that Old English authors
differed in their use of existential there, exceeding the differences arising from genres or
dialects. A similar, but weaker effect could be seen for existential constructions lacking an
existential pronoun. This raises the question of whether the two are in a complementary
distribution. This question is interesting since it offers a glimpse into the social aspects of the
grammatical system. If the distinction cannot be explained by grammatical or stylistic factors,
we have some provisional reason for exploring explanations based on prestige, alongside
author idiosyncrasies.
Section 3 demonstrated that the use of existential there in Old English is fairly
heterogeneously dispersed among the texts in the YCOE corpus. This contrasts with an
example of the dominant existential construction without there whose dispersion is much
more homogeneous. Furthermore, it was suggested that this highlighted not only a difference
in how much the two constructions were used, but in who used them, i.e. a form of social
differentiation. Although the argument presented above is valid as far as it goes, it does not
present direct evidence that differential social weighting was involved. In short, although the
argument is plausible, it could certainly be supported better. The present section discusses
how the link between text/author, rates of existential uses of there, and differential social
weighting can be made more explicit.
4.1 Standardization and differential social weighting
At the end of the tenth century and the beginning of the eleventh century, attempts were
being made to standardize both Old English spelling and grammar by the so-called
‘Winchester School’, notably represented by Ælfric (Gneuss 1972, Gretsch 2009). Although
this standard, or ‘preferred usage’, seems to have been disseminated with some degree of
systematicity (Irvine 2006: 49), it is perhaps more accurate to think of it as a supra-regional
model concentrated around centers of learning and scholarship, as was the case for other elite-
driven prestige variants elsewhere in Europe (Salmons 2012: 159). Nevertheless,
standardization toward a socially prestigious norm would provide a plausible context for the
social weighting required in Blythe and Croft’s model. The next section describes how we can
measure the degree of such standardization efforts.
A good case for the importance of standardization would present itself if texts that
can be linked to standardization efforts could be shown to be using the construction with
existential there more often than expected, and if those texts are less likely to use the type C
construction. In short: are the type C and type D constructions in complementary distribution,
and are any texts linked to standardization efforts aligned with those distributions? To decide
whether constructions with or without there were in a complementary distribution in Old
English, the data should preferably include a full range of the various types of existential
constructions, some of which are not easily retrieved from a corpus without manual
inspection. For this reason, data from Breivik (1990:198) will be considered. The data
consists of Old English clauses classified as existential and belonging to one of the three types
exemplified in (3)–(5) above: with existential there (type D), with a zero pronoun (type C), or
other clauses (type A/B).
Table 1: Frequency counts of three types of existential constructions in Old English texts: Bede’s Ecclesiastical
History, the Blickling Homilies, the Anglo Saxon Chronicle, the Book of Exeter, the Old English Orosius, and
Ælfric’s Catholic Homilies. After Breivik (1990: 198).

Text         Other (A/B)   Ø (C)   There (D)
Bede         14            97      1
Blickling    13            33      12
ASc          6             31      4
Exeter       8             29      4
OR           13            73      21
Ælfric       27            123     39

The frequencies in table 1 can be conveniently visualized in an association plot, where the
dotted lines represent expected frequencies, the height of the bars represents (positive or
negative) deviations from the expected frequencies, and the width of the bars represents the
amount of evidence supporting the observation. For more details on association plots see
Jenset (2010: 87–88) and references therein.
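The quantities behind such a plot can be sketched directly from table 1. The following is a minimal illustration in Python with NumPy, not the code used for figure 2; the variable names are hypothetical, and it assumes the standard Cohen-Friendly construction in which bar height is proportional to the Pearson residual and bar width to the square root of the expected frequency.

```python
import numpy as np

# Observed counts from table 1 (rows: texts; columns: A/B, C, D).
texts = ["Bede", "Blickling", "ASc", "Exeter", "OR", "Ælfric"]
types = ["A/B", "C", "D"]
observed = np.array([
    [14, 97, 1],
    [13, 33, 12],
    [6, 31, 4],
    [8, 29, 4],
    [13, 73, 21],
    [27, 123, 39],
], dtype=float)

row_totals = observed.sum(axis=1, keepdims=True)   # shape (6, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 3)
grand_total = observed.sum()

# Expected counts under independence of text and construction type.
expected = row_totals @ col_totals / grand_total

# Pearson residuals: signed deviation from expectation (assumed bar height).
pearson_residuals = (observed - expected) / np.sqrt(expected)

# Square root of expected counts (assumed bar width).
bar_widths = np.sqrt(expected)

# Residuals per text, keyed by construction type, for inspection.
residuals_by_text = {
    text: dict(zip(types, row)) for text, row in zip(texts, pearson_residuals)
}
```

Inspecting `residuals_by_text` shows, for each text, whether type C and type D deviate from expectation in opposite directions, which is the pattern the plot is meant to reveal.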
Figure 2: Association plot displaying deviations from expected frequencies for three types of
existential constructions in Old English texts, based on the frequencies in table 1. Gray bars
represent positive deviations; black bars negative deviations.
The plot in figure 2 is interesting for several reasons. Firstly, it shows that existential
clauses with an initial NP (Breivik’s type A/B) share no particular relationship with the other
two types as far as distributions are concerned: there is no symmetry between over- or under-
representation of type A/B clauses and the other two types. Furthermore, deviations from the
expected values are small. Secondly, we see that existential constructions with there (type D)
and those with zero pronouns (type C) are indeed in complementary distribution. When one of
the two is overrepresented in a text, the other is underrepresented. Furthermore, the strength
of the evidence as shown by the height and width of the bars lends plausibility to the
interpretation that this reflects real differences. Finally, we see that for compiled works such
as the Anglo-Saxon Chronicle and the Book of Exeter, the differences are much smaller than
for single-author works, such as Ælfric’s works and the Old English translation of Bede. This
raises the question of whether such differences have arisen simply because of author
idiosyncrasies which have cancelled themselves out in the multi-author works, or because of a
more systematic difference.
The differences and similarities can be summarized more succinctly using a technique
called CORRESPONDENCE ANALYSIS (CA). CA is a technique for reducing the variation in
frequencies in a multivariate matrix to its most salient patterns of association in a manner that
lends itself to visualization in a two-dimensional graph, a biplot. For further technical details
see Greenacre (2007); see Baayen (2008: 128–136) or Jenset & McGillivray (2012) for
introductions to CA with a more linguistic focus.
Figure 3: CA biplot of the frequencies in table 1. The plot captures the data well, explaining
100% of the variation in the data. The Blickling Homilies, the Old English Orosius, and
Ælfric’s Catholic Homilies are notable in their tendency to use existential constructions with
there.
The CA biplot in figure 3, created with the ca package in R (Greenacre & Nenadic
2007), summarizes the most salient patterns in the data in a two-dimensional subspace. The
horizontal axis, which by definition is the most influential dimension (in this case accounting
for 88.1% of the total variation), is defined by the opposition between constructions with there
and those with the zero pronoun. The translation of Bede, the Book of Exeter, and the
Anglo-Saxon Chronicle all tend to use the type C construction, while Ælfric, the Old English
Orosius, and the Blickling Homilies tend to use the type D construction with existential there.
The second (vertical) dimension accounts for a mere 11.9% of the total variation, and is hence
of less importance for this analysis: the dominant feature of the data is the opposition between
there and the zero pronoun.
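The computation underlying a CA of table 1 can be sketched as a singular value decomposition of the standardized residuals. The snippet below is a minimal NumPy illustration of the general technique (cf. Greenacre 2007), not the ca package in R used for figure 3; variable names are hypothetical, and it makes no claim to reproduce the reported percentages exactly.

```python
import numpy as np

# Counts from table 1 (rows: Bede, Blickling, ASc, Exeter, OR, Ælfric;
# columns: A/B, C, D).
counts = np.array([
    [14, 97, 1],
    [13, 33, 12],
    [6, 31, 4],
    [8, 29, 4],
    [13, 73, 21],
    [27, 123, 39],
], dtype=float)

P = counts / counts.sum()      # correspondence matrix (relative frequencies)
r = P.sum(axis=1)              # row masses
c = P.sum(axis=0)              # column masses

# Standardized residuals: deviations from independence, scaled by the masses.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# The SVD yields the dimensions, ordered by decreasing importance.
U, singular_values, Vt = np.linalg.svd(S, full_matrices=False)

principal_inertias = singular_values ** 2
inertia_share = principal_inertias / principal_inertias.sum()

# Principal coordinates of rows (texts) and columns (construction types),
# i.e. the points that would be drawn in a biplot.
row_coords = (U * singular_values) / np.sqrt(r)[:, None]
col_coords = (Vt.T * singular_values) / np.sqrt(c)[:, None]
```

Because the table has only three columns, at most two dimensions carry any inertia, which is why a two-dimensional biplot can account for all of the variation in these data. The first two entries of `inertia_share` correspond to the proportions attributed to the horizontal and vertical axes.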
The biplot in figure 3 is based on data that were collected and
annotated independently of the data used in the present study, even if there is some overlap in
the sources. For this reason, the data from Breivik (1990) provide an excellent comparison
case. In short, observing the same pattern in another, partially overlapping, dataset annotated
with different methods should lead to increased confidence in the result.
Turning to the data from YCOE, a small excerpt of which can be seen in table 2, we would
expect to see more or less the same pattern as in figure 3 when constructing a new biplot
based on the YCOE data. The data are annotated somewhat differently in YCOE, and the
categories differ accordingly: the categories used in this analysis are frequencies per author
or manuscript of locative adverbs, of existential there, of forms of be co-occurring with a
locative adverb or existential there, and of lexemes other than be occurring in the same position.
The resulting 4 × 60 matrix is difficult to interpret directly, and we instead turn to the CA
biplot in figure 4.
Table 2: Example excerpt from the matrix containing frequencies of locative adverbs,
existential there, and their contexts by text in YCOE. The full matrix has 60 rows.
Category            coadrian    cobede    coblick    cobyrhtf
LOCATIVE                4         437       270         47
EXISTENTIAL             0          18        15         10
NON-BE (context)        4         423       242         36
BE (context)            0          32        43         21
The biplot in figure 4 is a good visualization from the point of view of variation accounted
for: all the variation can be captured in two dimensions, with the first (horizontal) dimension
accounting for 85.8%. The alignment of existential there with forms of be is taken as a proxy
for the type D construction, cf. also the argument from Construction Grammar presented in
section 1. In the center of the biplot we find locative adverbs and non-be contexts, alongside
the majority of texts. The horizontal dimension is defined by the opposition between
existential there occurring together with forms of be, and the other categories. On the
right-hand side of the plot, associated with existential there and be, we find among others the Old
English Orosius (cooriosiu), the West-Saxon Gospels (cowsgosp), the homilies of Wulfstan
(cowulf), Byrhtferth’s Manual (cobyrhtf), and a number of texts by Ælfric. This result is
strikingly similar to the results in figure 3 based on the independently collected and analyzed
data from Breivik (1990).
Figure 4: CA biplot of the YCOE frequencies exemplified in table 2. The biplot captures the
data well, explaining 100% of the variation. Texts on the right-hand side of the biplot are
more associated with the existential there constructions.
Notably, other texts from the same period, i.e. the first half of the eleventh century, are
not particularly associated with existential there. These include translations of the Dialogues
of Gregory the Great (cogregdC and cogregdH, visible in the lower half of the plot in figure 4),
as well as parts of the Anglo-Saxon Chronicle from the same period (located in the center, not
visible). The pattern in the plot cannot easily be attributed to a specific genre: in addition to
the Gospels, we find homilies (Wulfstan, Ælfric), instructional material (Byrhtferth), and
letters and prefaces (Ælfric).
4.2 Authors and authority
A striking feature of the plot in figure 4 is that many of the texts that are associated with
existential there are texts written or translated by a single identified author. If such an
association were found for one author, it might plausibly be considered an idiosyncratic feature
associated with that author. However, this is less plausible when the same association is found
in a number of texts by different authors. To estimate the effect, if any, of a single identified
author, a BINARY LOGISTIC REGRESSION model was used. This approach allows us to
simultaneously compare the association between identified authors and the use of existential
there with the effects of structural variables, such as grammatical features. A binary
outcome, in this case an existential use of there or a locative adverb (the most frequent of
which is there), can be considered a function of a number of predictor variables, and by using
a regression model we can investigate the specific properties of this function. For more
detailed discussions and examples of regression models in historical linguistics, see Baayen
(2008: 218–222), Gries & Hilpert (2010), or Jenset (2010; 2014).
The following model was used:
Model: Response (probability of existential there) modeled as depending on the following
fixed effects: co-occurrence with be (BeContext) + log-transformed overall complexity of the
sentence (LogComplexity) + interaction between co-occurrence with be and complexity +
co-occurrence with a nominative NP (NomNP) + single author (OneAuthor).
The model was a good fit to the data judging from diagnostic plots of residuals vs. fitted
values (not shown). Nagelkerke’s pseudo R² was 0.75, indicating that the model covers much
of the variation in the data and further corroborating the impression of a good fit. Finally, a
likelihood ratio test (Baayen 2008: 253) comparing models with and without the OneAuthor
predictor indicated that the inclusion of the predictor in the model is warranted. However, it is
worth noting that in a model where OneAuthor is the only predictor, the fit is less good and
the predictor itself is not significant. This implies that this particular predictor only makes an
explanatory contribution when included alongside predictors representing grammatical
variables. Put differently, whatever social differentiation it represents, it fills in a gap left by
the predictors representing grammatical structure, but it is insufficient on its own.
Table 3: Summary of the logistic regression model. The coefficients indicate the effect each
predictor has on the log odds of finding an instance of existential there compared to a
locative adverb.
Predictor                    Coef.      Std. Error    z-value    p-value
Intercept                   -6.4741       0.3491      -18.55     < 0.0001
BeContext                    1.7300       0.4454        3.88       0.0001
LogComplexity                0.6685       0.1980        3.38       0.0007
NomNP                        1.7092       0.2462        6.94     < 0.0001
OneAuthor                    0.6667       0.2304        2.89       0.0038
BeContext × LogComplexity    3.6093       0.3657        9.87     < 0.0001
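The relation between the columns of table 3 can be checked with a few lines of Python. This is a minimal sketch that assumes the z-values are Wald statistics (coefficient divided by standard error) with two-sided p-values from the standard normal distribution; the function and variable names are hypothetical and not taken from the original analysis.

```python
import math

# Coefficients and standard errors as reported in table 3.
coefficients = {
    "Intercept": (-6.4741, 0.3491),
    "BeContext": (1.7300, 0.4454),
    "LogComplexity": (0.6685, 0.1980),
    "NomNP": (1.7092, 0.2462),
    "OneAuthor": (0.6667, 0.2304),
    "BeContext x LogComplexity": (3.6093, 0.3657),
}


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


wald = {name: wald_test(est, se) for name, (est, se) in coefficients.items()}

for name, (z, p) in wald.items():
    print(f"{name:28s} z = {z:7.2f}   p = {p:.4f}")
```

Under these assumptions the recomputed z-values agree with those in the table, and the p-values for the intercept, NomNP, and the interaction term fall below 0.0001.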
The logistic regression model is summed up in table 3. The analysis shows that the
OneAuthor variable, like the other variables, is statistically significant, with a margin of error
that is smaller than the effect itself, but how does this compare to the other predictors in the
model? Apart from the intercept, all the coefficients are positive, which means that they
increase the probability of existential there. However, the intercept is a baseline: the
likelihood of existential there when all the predictors are kept at their reference levels (false
for binary variables and zero for numeric ones). Hence, the intercept is less interesting than
the degree to which the predictors are associated with an increased likelihood of existential
there from this baseline. The coefficients in the table are log-transformed odds-ratios, which
complicates interpretation. However, a simple rule of thumb is to divide the coefficient by
four, since this will provide an approximation to the maximum impact, or effect, that the
variable in question has on the outcome (Gelman & Hill 2007: 82). Since the model specifies
an interaction term for BeContext and LogComplexity, this coefficient needs to be taken into
account for those predictors. Adding the BeContext and the interaction coefficients, we get
5.3393 which, divided by four and rounded off, corresponds to a 133% increase in the
probability of existential there. For LogComplexity (again, with the interaction term added)
we get an increase of about 100%, whereas co-occurrence with a nominative NP is associated
with a 43% increase in the likelihood of existential there. Finally, texts written by one,
identified author represent a 17% increase in the likelihood of existential there.
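The arithmetic behind these figures can be retraced as follows. The sketch takes the coefficients from table 3 and applies the divide-by-four rule as described above; the dictionary names are hypothetical. Note that the rule yields an upper bound on the change in probability per unit change in a predictor, reached where the logistic curve is steepest, so the values are best read as a ranking of relative effect sizes.

```python
# Coefficients (log odds-ratios) from table 3.
coef = {
    "BeContext": 1.7300,
    "LogComplexity": 0.6685,
    "NomNP": 1.7092,
    "OneAuthor": 0.6667,
    "Interaction": 3.6093,  # BeContext x LogComplexity
}

# Predictors involved in the interaction have its coefficient added,
# following the procedure described in the text.
combined = {
    "BeContext": coef["BeContext"] + coef["Interaction"],
    "LogComplexity": coef["LogComplexity"] + coef["Interaction"],
    "NomNP": coef["NomNP"],
    "OneAuthor": coef["OneAuthor"],
}

# Divide-by-four rule (Gelman & Hill 2007: 82).
divide_by_four = {name: value / 4 for name, value in combined.items()}

for name, value in divide_by_four.items():
    print(f"{name:14s} {value:.4f}")
```

The ordering of the resulting values (BeContext, LogComplexity, NomNP, OneAuthor) matches the ordering of effects discussed in the text.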
In summary, the present section has shown that the type C and D constructions are in
complementary distribution, that results based on partially independent data from Breivik
(1990) and YCOE lead to very similar conclusions, and that late Old English texts with one
identifiable author are particularly likely to use the type D construction with there.
Furthermore, these authors both represent individual prestige and can be linked to
standardization efforts (Gneuss 1972, Hogg 2002: 120). The next section will discuss these
results in light of the Blythe-Croft model presented in section 2.
5 DISCUSSION
The explorative steps in section 3 above indicated that individual authors used existential
there at different rates. In section 4, we could further pinpoint this to single-authored texts in
the late Old English period in the eleventh century using CA. With a logistic regression
model, it was found that the variable indicating whether a text was written by a single,
identified author was a significant, albeit small, predictor of the occurrence of existential
there. Furthermore, the OneAuthor predictor was only significant when included alongside
predictors representing grammatical variables. The analysis also revealed that several authors
followed the same pattern, reducing the likelihood that this was caused by individual
idiosyncrasies. Finally, the differences found among texts from the same time period suggest
that the use of type C and type D constructions was not
completely homogeneous in the late Old English period, leaving room for other explanations
such as social prestige and normative efforts.
The analysis presented above lends plausibility to the proposal made by Blythe &
Croft (2012) regarding the importance of social factors for explaining the S-shaped trajectory
often found in linguistic change. More importantly, it tests the proposal and provides a more
nuanced picture by measuring the relative importance of the variables involved. As such, the
study testifies to the importance of both social and linguistic factors, and highlights the need
for multivariate statistical methods in historical linguistics. While the grammatical and
cognitive factors discussed by Breivik (1990; 1997) and Jenset (2010; 2014) regarding the
evolution of existential there are doubtlessly important, the fact that they manifest themselves
in larger effect sizes (cf. table 3 above) should not lead us to a situation where social factors
are dismissed. As the present study attests, grammatical and social factors can be studied
side by side; the Blythe-Croft model provides the theoretical motivation for doing so.
The results from the present study also raise the pertinent question of what kinds of
effect sizes should be expected in theoretically motivated empirical explorations in historical
linguistics. It is perhaps easy to focus on large effects, and sometimes this is doubtlessly
correct. However, the gradual nature of linguistic change coupled with the need to coordinate
linguistic norms across speakers and communities for effective communication suggests that
variables with smaller effects (such as the OneAuthor predictor in the model discussed in
section 4) should also be considered, simply because relatively large changes might have
modest causes in a complex system.
It should be emphasized, however, that the analysis above is based on Blythe &
Croft’s richly elaborated model, which again has explicit theoretical motivations. Thus, it is
not merely the convenient singling out of a small but statistically significant variable. It is
precisely within models such as Blythe and Croft’s or larger frameworks such as evolutionary
approaches to linguistic change that small effects can be interpreted meaningfully and can
provide valuable contributions. It is also worth pointing out that the present analysis based on
Blythe & Croft (2012) complements, rather than replaces, the analyses put forth in Breivik
(1990; 1997) and Jenset (2010; 2014). A key premise underlying the present study is the need
for simultaneously considering the multiple factors involved in both the innovation and
propagation stages of constructional change.
In the case of the constructional change taking place regarding the type C and type D
existential constructions in Old English, the analysis above coupled with the Blythe-Croft
model suggests pockets, or centers, where certain circles might plausibly have considered the
there-construction a norm conveying some prestige. According to the framework, the prestige
need not be shared by all speakers in the area; it is to be expected that linguistic prestige in the
Middle Ages as manifested in written manuscripts would have been limited to certain circles
rather than the general population (Salmons 2012: 159). As the construction gained increased
conventional status, its association with different groups within a speech community would
have gradually increased (Croft 2000: 177).
The analysis above leaves unanswered the question of how the social differentiation
possibly involved in the replication (or propagation) of existential there in Old English might
have influenced later stages of the history of English. This omission is intentional, since it
falls outside the scope of the current article. Furthermore, the dialect diversity and lack of
standardization in Middle English following the Norman Conquest (Corrie 2006) would
require a somewhat different approach from that taken in the present paper. However, it is not
implausible that the texts written in Old English might have had some influence on the
language of later texts. In the Midlands, the area where standardization of Middle English
began from the fourteenth century, a link to the literary tradition of Old English was
maintained through the study and copying of older texts (Corrie 2006: 109–110). Admittedly
somewhat speculative, the preliminary results discussed in the present study can be recast as a
testable prediction for later stages of English: if a differential social weighting is crucially
involved in the propagation of existential there, we would expect its increasing probability to
coincide with the increasing social prestige (and standardization) of Midlands English after
the middle of the fourteenth century (Corrie 2006). However, the prediction is not merely a
question of geography. It fits well with an evolutionary model of social propagation (Croft
2000: 193–194) where communicative interaction ensures that novel forms spread between
population centers, a model which is also congruent with the results presented for Old English
above.
6 CONCLUSIONS
The investigation above makes a strong case for the view that differential social
weighting, or replicator selection, played a non-trivial role in the early propagation of the Old
English existential construction with there, i.e. type D, to the detriment of the type C
construction. This result contributes to answering the question of how there, once it had been
reanalyzed as an existential subject, kept on being reanalyzed, with the construction it
appeared in eventually replacing the former majority variant. For Old English, the period
studied here, it seems clear that Blythe and Croft’s model of replicator selection can lead to
new insights about the process of constructional change. We can find empirically motivated
indicators of differential social weighting in the data, possibly driven by the social prestige
associated with some groups and writers.
Crucially, the explanation developed above does not exclude or do away with
linguistic variables. Rather, the differential social weighting is operationalized and adapted to
an empirical framework which allows us to keep linguistic and social variables side by side.
As such, the analysis presented in the present paper builds on the results from Breivik (1990)
by demonstrating how structural factors involved in an innovation might be qualified by
social factors in the process of propagation. Blythe and Croft’s model is elegant and
persuasive, but it is still just a model. Empirical studies are required to ascertain to what
degree the map fits the terrain. In looking for social factors to explain linguistic change, it is
important not to throw the linguistic baby out with the bath water. The interaction, or perhaps
competition, between social and linguistic variables needs detailed investigations along the
different temporal points of the S-curve. This is required not only to establish the relative
importance of the variables in the ongoing propagation of change, but also to gain a deeper
understanding of how social factors interact with syntactic, semantic, and pragmatic factors in
driving linguistic change.
REFERENCES
Baayen, R. Harald. 2008. Analyzing linguistic data: A practical introduction to statistics
using R. Cambridge: Cambridge University Press.
Baxter, Gareth J., Richard A. Blythe, William Croft & Alan J. McKane. 2009. Modeling
language change: An evaluation of Trudgill’s theory of the emergence of New
Zealand English. Language Variation and Change 21(2), 257–296.
Blythe, Richard A. & William Croft. 2012. S-curves and the mechanisms of propagation in
language change. Language 88(2), 269–304.
Blythe, Richard A. & William A. Croft. 2009. The speech community in evolutionary
language dynamics. Language Learning 59(s1), 47–63.
Breivik, Leiv Egil. 1990. Existential there: a synchronic and diachronic study. 2nd edn. Oslo:
Novus.
Breivik, Leiv Egil. 1997. There in space and time. In Heinrich Ramisch & Kenneth Wynne
(eds.), Language in time and space: studies in honour of Wolfgang Viereck on the
occasion of his 60th birthday, 32–45. Stuttgart: Franz Steiner Verlag.
Breivik, Leiv Egil & Toril Swan. 2000. The desemanticisation of existential there in a
synchronic-diachronic perspective. In Christiane Dalton-Puffer & Nikolaus Ritt (eds.),
Words: Structure, meaning, function–A Festschrift for Dieter Kastovsky, 19–34.
Berlin: Mouton de Gruyter.
Chambers, J. K. 2002. Patterns of Variation including Change. In Chambers et al. (eds.), 349–
372.
Chambers, J. K., Peter Trudgill & Natalie Schilling-Estes (eds.). 2002. The handbook of
language variation and change. Malden, MA.: Blackwell Publishing.
Corrie, Marilyn. 2006. Middle English - Dialects and Diversity. In Mugglestone (ed.), 86–
119.
Croft, William. 2000. Explaining language change: An evolutionary approach. London:
Longman.
Croft, William. 2001. Radical Construction Grammar: Syntactic theory in typological
perspective. Oxford: Oxford University Press.
Davies, Mark. 2008. The Corpus of Contemporary American English (COCA): 410+ million
words, 1990-present. Brigham Young University. http://www.americancorpus.org/.
Fought, Carmen. 2002. Ethnicity. In Chambers et al. (eds.), 444–472.
Gelman, Andrew & Jennifer Hill. 2007. Data analysis using regression and multilevel /
hierarchical models. Cambridge: Cambridge University Press.
Gneuss, Helmut. 1972. The origin of Standard Old English and Æthelwold’s school at
Winchester. Anglo-Saxon England 1, 63–83.
Goldberg, Adele E. 1995. Constructions: A construction grammar approach to argument
structure. Chicago: University of Chicago Press.
Greenacre, Michael. 2007. Correspondence analysis in practice. 2nd edn. Boca Raton, FL.:
Chapman & Hall/CRC.
Greenacre, Michael & Oleg Nenadic. 2007. ca: Simple, Multiple and Joint Correspondence
Analysis. http://www.carme-n.org/.
Gretsch, Mechthild. 2009. Ælfric, Language and Winchester. In Hugh Magennis & Mary
Swan (eds.), A companion to Ælfric, 109–138. Leiden: Brill.
Gries, Stefan Th. 2008. Dispersions and adjusted frequencies in corpora. International
Journal of Corpus Linguistics 13(4), 403–437.
Gries, Stefan Th. & Martin Hilpert. 2010. Modeling diachronic change in the third person
singular: a multifactorial, verb-and author-specific exploratory approach. English
Language and Linguistics 14(3), 293–320.
Hilpert, Martin. 2013. Constructional Change in English: Developments in Allomorphy, Word
Formation, and Syntax. Cambridge: Cambridge University Press.
Hogg, Richard. 2002. An introduction to Old English. Oxford: Oxford University Press.
Irvine, Susan. 2006. Beginnings and Transitions: Old English. In Mugglestone (ed.), 32–60.
Jenset, Gard B. To appear. When data grow on trees: Random Forests in historical corpus
linguistics.
Jenset, Gard B. 2010. A Corpus-based Study on the Evolution of There: Statistical Analysis
and Cognitive Interpretation. Ph.D. dissertation, University of Bergen.
http://hdl.handle.net/1956/4444.
Jenset, Gard B. 2014. Mapping meaning with distributional methods: a diachronic corpus-
based study of existential there. Journal of Historical Linguistics 3(2), 272–306.
Jenset, Gard B. & Barbara McGillivray. 2012. Multivariate analyses of affix productivity in
translated English. In Michael P. Oakes & Meng Ji (eds.), Quantitative Methods in
Corpus-Based Translation Studies, 301–323. Amsterdam: John Benjamins Publishing
Company.
Labov, William. 2001. Principles of Linguistic Change, vol 2: Social Factors. Oxford:
Blackwell.
Mugglestone, Lynda (ed.). 2006. The Oxford History of English. Oxford: Oxford University
Press.
Nevalainen, Terttu & Helena Raumolin-Brunberg. 2003. Historical sociolinguistics. London:
Longman.
R Development Core Team. 2011. R: A Language and Environment for Statistical
Computing. Vienna. http://www.r-project.org.
Salmons, Joe. 2012. A history of German: what the past reveals about today’s language.
Oxford: Oxford University Press.
Schmitt, Norbert. 2010. Researching Vocabulary: A Vocabulary Research Manual.
Basingstoke: Palgrave Macmillan.
Schneider, Edgar W. 2002. Investigating variation and change in written documents. In
Chambers et al. (eds.), 67–96.
Taylor, Ann, Anthony Warner, Susan Pintzuk & Frank Beths. 2003. The York-Toronto-
Helsinki Parsed Corpus of Old English Prose.
Trudgill, Peter. 2002. Social Differentiation. In Chambers et al. (eds.), 373–374.
Williams, Alexander. 2000. Null subjects in Middle English existentials. In S. Pintzuk, G.
Tsoulos & A. Warner (eds.), Diachronic syntax: Models and mechanisms, 285–310.
Oxford: Oxford University Press.