shall - University College London

English Corpus Linguistics
Introducing the Diachronic Corpus of
Present-Day Spoken English (DCPSE)
Sean Wallis
UCL
Barber (1964): changes in English grammar
a. A tendency to regularize irregular morphology (e.g. dreamt → dreamed);
b. A revival of the “mandative” subjunctive, probably inspired by formal US usage (we demand that she take part in the meeting);
c. Elimination of shall as a future marker in the first person;
d. Development of new, auxiliary-like uses of certain lexical verbs (e.g. get, want – cf., e.g., The way you look, you wanna / want to see a doctor soon);
e. Extension of the progressive to new constructions, e.g. modal, present perfect and past perfect passive progressive (the road would not be being built / has not been being built / had not been being built before the general elections);
f. Increase in the number and types of multi-word verbs (phrasal verbs, have/take/give a ride, etc.);
g. Placement of frequency adverbs before auxiliary verbs (even if no emphasis is intended – I never have said so);
h. Do-support for have (have you any money? and no, I haven’t any money → do you have / have you got any money? and no, I don’t have any money / haven’t got any money)…
The Diachronic Corpus of Present-day
Spoken English (DCPSE)
– Orthographically transcribed spoken BrE
– Fully parsed
• every ‘sentence’ has a tree diagram
• searchable with ICECUP and FTFs
– 400,000+ words each from
• London-Lund Corpus (aka The ‘Survey Corpus’)
• ICE-GB
– Balanced by text category
– Not evenly distributed by year
• LLC: samples from 1958-1977
• ICE-GB: 1990-1992
Tree diagrams
A tree diagram for the sentence We’re getting there.
Barber on shall and will
• [T]he distinctions formerly made between shall and will are being lost,
and will is coming increasingly to be used instead of shall. One reason
for this is that in speech we very often say neither [will] nor [shall], but
just [’ll]: I’ll see you to-morrow, we’ll meet you at the station, John’ll get
it for you. We cannot use this weak form in all positions (not at the end
of a phrase, for example), but we use it very often; and, whatever its
historical origin may have been (probably from will), we now use it
indiscriminately as a weak form for either shall or will; and very often
the speaker could not tell you which he had intended. There is thus
often a doubt in a speaker’s mind whether will or shall is the
appropriate form; and, in this doubt, it is will that is spreading at the
expense of shall, presumably because will is used more frequently
than shall anyway, and so is likely to be the winner in a levelling
process. So people nowadays commonly say or write I will be there,
we will all die one day, and so on, when they intend to express simple
futurity and not volition.
(Barber 1964: 134)
Denison on shall and will
• During the latter part of our period [1776-present
day] ... in the first person shall has increasingly
been replaced by will even where there is no
element of volition in the meaning.
(Denison 1998: 167)
The use of shall and will in written British and American English from the 1960s and 1990s

BrE      LOB     FLOB    LL     diff %
will     2,798   2,723    1.2    -2.7%
shall      355     200   44.3   -43.7%

AmE      Brown   Frown   LL     diff %
will     2,702   2,402   17.3   -11.1%
shall      267     150   33.1   -43.8%

From: Mair and Leech (2006: 327)
• Figures are frequencies normalised per million words
• Log likelihood (LL) is calculated against the number of words
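As a quick check on the table, the diff % column can be recomputed from the per-million-word frequencies. The following is a minimal sketch; the names FREQS and percent_diff are illustrative and not from Mair and Leech. The LL column is not reproduced, because it depends on corpus sizes that the table does not give.

```python
# Per-million-word frequencies copied from the table above
# (Mair and Leech 2006: 327). Each pair is (1960s corpus, 1990s corpus).
FREQS = {
    ("BrE", "will"): (2798, 2723),   # LOB, FLOB
    ("BrE", "shall"): (355, 200),
    ("AmE", "will"): (2702, 2402),   # Brown, Frown
    ("AmE", "shall"): (267, 150),
}


def percent_diff(earlier, later):
    """Percentage change from the earlier frequency to the later one."""
    return 100.0 * (later - earlier) / earlier


for (variety, verb), (f_1960s, f_1990s) in FREQS.items():
    print(f"{variety} {verb}: {percent_diff(f_1960s, f_1990s):+.1f}%")
```

Note that this measures change against a word-count baseline only, which is the limitation the following slides address.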
Mair and Leech’s data
• Simply counts tagged lexical tokens
– Will = auxiliary verb, includes ’ll
– Shall = auxiliary verb
– Includes negative forms
• Does not distinguish by grammatical position or context
– Does not ask whether the choice is available, e.g. limit to first
person use
– Does not consider subclasses separately
• Negative cases: will not/won’t vs. shall not/shan’t?
• Do interrogative cases behave differently?
• Is written data only
• Can we do better than this?
An FTF for first person declarative shall
• This FTF is limited to first person cases
– The FTF requires that the NP is realised by the pronoun I or we.
• Interrogative cases have a different structure
• We can subtract negative (shall not) cases to exclude
them.
Shall vs. will
• Does the proportion of cases of shall out of {shall, will} change
over time?
            shall   will   Total   χ²(shall)   χ²(will)
LLC           110     78     188        1.32       1.45
ICE-GB         40     58      98        2.53       2.79
TOTAL         150    136     286        3.85       4.24

Summary: d% = -30.24% ± 20.84%, φ = 0.17, χ² = 8.09

• χ² for first person subject; shall vs will
– d% = percentage difference (30% fall in shall between LLC and ICE-GB)
– φ = an estimate of the size of the overall effect (a bit like d%)
– χ² = 2x2 chi-square test: is this change statistically significant?
– χ²(shall) = 2x1 goodness of fit test: does shall behave differently to average?
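The summary statistics can be recomputed from the four observed frequencies. This is a minimal sketch in plain Python, not the software used to prepare the slides; the function name chi_square is illustrative. The ± 20.84% interval on d% is not reproduced, because the slides do not say how it was calculated.

```python
from math import sqrt

# First person declarative frequencies from the table above.
observed = {
    "LLC": {"shall": 110, "will": 78},
    "ICE-GB": {"shall": 40, "will": 58},
}


def chi_square(table):
    """Pearson chi-square for a contingency table given as nested dicts.

    Returns the statistic and the grand total N.
    """
    rows = list(table)
    cols = list(table[rows[0]])
    row_tot = {r: sum(table[r].values()) for r in rows}
    col_tot = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_tot.values())
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            chi2 += (table[r][c] - expected) ** 2 / expected
    return chi2, n


chi2, n = chi_square(observed)
phi = sqrt(chi2 / n)  # effect size for a 2x2 table

p_llc = observed["LLC"]["shall"] / sum(observed["LLC"].values())
p_ice = observed["ICE-GB"]["shall"] / sum(observed["ICE-GB"].values())
d_percent = 100.0 * (p_ice - p_llc) / p_llc  # relative change in p(shall)

CRITICAL_005_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 0.05 level
print(f"chi2 = {chi2:.2f}, phi = {phi:.2f}, d% = {d_percent:.2f}%")
print("significant at 0.05:", chi2 > CRITICAL_005_DF1)
```

The result exceeds the critical value for one degree of freedom, so the fall in the proportion of shall is significant at the 0.05 level.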
Shall vs. will/’ll
• Does the proportion of cases of shall out of {shall, will, ’ll} change
over time?
            shall   will    ’ll   Total   χ²(shall)   χ²(will)   χ²(’ll)
LLC           104     69    371     544        9.98       0.13      2.33
ICE-GB         36     52    365     453       11.98       0.16      2.80
TOTAL         140    121    736     997       21.96       0.30      5.13

• χ² for first person subject; shall vs will vs. ’ll
– χ²(shall) = 2x1 goodness of fit test: does shall behave differently to average?
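The per-column figures are 2x1 goodness of fit tests. Each compares how one form is split between the two corpora with how all three forms together are split. A minimal sketch follows, with the illustrative function name column_goodness_of_fit:

```python
# First person declarative frequencies from the table above.
observed = {
    "LLC": {"shall": 104, "will": 69, "'ll": 371},
    "ICE-GB": {"shall": 36, "will": 52, "'ll": 365},
}


def column_goodness_of_fit(table, column):
    """2x1 goodness of fit test for one column.

    Expected values assume the column is distributed across the corpora
    in the same proportion as the row totals. Returns per-corpus
    contributions; their sum is the chi-square statistic.
    """
    row_tot = {r: sum(cells.values()) for r, cells in table.items()}
    n = sum(row_tot.values())
    col_tot = sum(cells[column] for cells in table.values())
    contributions = {}
    for r, cells in table.items():
        expected = col_tot * row_tot[r] / n
        contributions[r] = (cells[column] - expected) ** 2 / expected
    return contributions


CRITICAL_005_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 0.05 level

for form in ("shall", "will", "'ll"):
    parts = column_goodness_of_fit(observed, form)
    total = sum(parts.values())
    verdict = "differs from average" if total > CRITICAL_005_DF1 else "no significant difference"
    print(f"{form}: LLC {parts['LLC']:.2f}, ICE-GB {parts['ICE-GB']:.2f}, "
          f"total {total:.2f} ({verdict})")
```

On these figures shall departs most strongly from the overall distribution, while will on its own does not differ significantly.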
Focusing on choice
• We focused on the choice of shall vs. will
– Mair and Leech simply said that total cases of shall fell
– But this might have happened for other reasons
• For example there may have been more opportunities to use shall in
the LLC data
• Examining choice is a more precise way of conducting
experiments than counting frequencies
– It allows us to consider what variables (time, genre, other choices)
affect the probability of shall being chosen
• Probability is a simple fraction from 0 to 1.
– p(shall) = F(shall) / (F(shall) + F(will) + …)
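To illustrate how the baseline affects the result, the sketch below computes p(shall) twice: once against {shall, will} and once against {shall, will, ’ll}. It uses the frequencies reported earlier. The function name p_choice is illustrative.

```python
def p_choice(freqs, target):
    """Probability of choosing `target` out of the listed alternatives."""
    return freqs[target] / sum(freqs.values())


# Baseline {shall, will}
two_way = {
    "LLC": {"shall": 110, "will": 78},
    "ICE-GB": {"shall": 40, "will": 58},
}
# Baseline {shall, will, 'll}
three_way = {
    "LLC": {"shall": 104, "will": 69, "'ll": 371},
    "ICE-GB": {"shall": 36, "will": 52, "'ll": 365},
}

for label, data in (("shall vs. will", two_way), ("shall vs. will/'ll", three_way)):
    for corpus, freqs in data.items():
        print(f"{label:20s} {corpus:7s} p(shall) = {p_choice(freqs, 'shall'):.4f}")
```

The probability falls between LLC and ICE-GB under both baselines, but its absolute level is much lower once ’ll is counted as an alternative.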
Probability of shall vs. will over time
Probability of shall vs. will/’ll over time
Confidence intervals
• Probability p(shall):
0 = no cases are of type shall
1 = all cases are of type shall
• Our sample is a tiny subset of possible sentences
from the same period
– So we cannot say a particular observation is certain
– Instead we try to estimate our confidence in an
observation using error bars or confidence intervals
• The more data we have supporting an observation p,
the smaller the confidence interval around it
• We set a confidence level, typically of 95%
– we are 95% sure that the true value is within the interval
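The slides do not say which interval formula was used. As one possible illustration, the sketch below uses the Wilson score interval, which is an assumption here, and applies it to p(shall) out of {shall, will} in each subcorpus. The function name wilson_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for an observed proportion successes / n.

    Assumption: the Wilson formula stands in for whichever interval
    the original graphs used.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# shall out of {shall, will}: LLC 110 of 188, ICE-GB 40 of 98
llc_low, llc_high = wilson_interval(110, 188)
ice_low, ice_high = wilson_interval(40, 98)

print(f"LLC    p = {110 / 188:.3f}, 95% interval = ({llc_low:.3f}, {llc_high:.3f})")
print(f"ICE-GB p = {40 / 98:.3f}, 95% interval = ({ice_low:.3f}, {ice_high:.3f})")
print("intervals overlap:", ice_high >= llc_low)
```

ICE-GB has fewer cases than LLC, so its interval is wider. With these figures the two intervals do not overlap, which agrees with the significant 2x2 χ² result.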
Modal meaning
• Remember Barber and Denison. Not all cases of shall or will mean the same thing
– Root (futurity):
• I’ve got some at home so I shall take it home. [DI-A18 #30]
• I will answer you in a minute. [DI-B30 #293]
– Epistemic (volition):
• So I shall have roughly from the twenty-ninth of June to the eighth of July on which I can spend the whole of that time on those two papers. [DL-B01 #62]
• It’s certainly my long term hope that I will have some kind of companion... [DI-B53 #0257]
• We should examine these choices separately
– Unfortunately this means classifying cases manually
Modal meaning: statistics
              shall               will
              LLC     ICE-GB      LLC     ICE-GB      Total
Root           33       22         44       37         136
  %         30.84    59.46      55.70    66.07
Epistemic      72       14         28       14         128
  %         67.29    37.84 (sig) 35.44   25.00 (sig)
Unclear         2        1          7        5          15
  %          1.87     2.70       8.86     8.93
Total         107       37         79       56         279
• Root shall / will is stable: results are not significant
• Epistemic shall / will falls (d% = -30% ± 27%)
– The fall in shall is not explained by the sharp fall in Epistemic modals
overall - from 100 (72+28) to 28 (14+14)
– This is evidence that the shift in use in C20 is concentrated within
Epistemic meanings, from shall to will.
– Barber and Denison: earlier shift was in Root (future) meaning.
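The two claims above can be checked by running a separate 2x2 χ² test on shall vs. will within each modal meaning. This is a minimal sketch using the counts from the table, with Unclear cases excluded. The function name chi_square_2x2 is illustrative.

```python
# Counts from the modal meaning table, as (shall, will) per subcorpus.
counts = {
    "Root": {"LLC": (33, 44), "ICE-GB": (22, 37)},
    "Epistemic": {"LLC": (72, 28), "ICE-GB": (14, 14)},
}

CRITICAL_005_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 0.05 level


def chi_square_2x2(row_a, row_b):
    """Pearson chi-square for a 2x2 table given as two (shall, will) rows."""
    rows = (row_a, row_b)
    row_tot = [sum(r) for r in rows]
    col_tot = [row_a[i] + row_b[i] for i in range(2)]
    n = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2


results = {}
for meaning, data in counts.items():
    chi2 = chi_square_2x2(data["LLC"], data["ICE-GB"])
    p_llc = data["LLC"][0] / sum(data["LLC"])
    p_ice = data["ICE-GB"][0] / sum(data["ICE-GB"])
    results[meaning] = chi2
    print(f"{meaning}: p(shall) {p_llc:.2f} -> {p_ice:.2f}, chi2 = {chi2:.2f}, "
          f"significant = {chi2 > CRITICAL_005_DF1}")
```

On these counts only the Epistemic test exceeds the critical value, consistent with the bullets above.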
Modal meaning: statistics
• Shall is losing its particular Epistemic meaning as a result
– In the LLC data two thirds (67%) of shall uses were Epistemic.
– This fell to 38% (just over one third) in ICE-GB.
Conclusions
• DCPSE is
– orthographically transcribed spoken English
• mostly spontaneous
– fully parsed and checked by linguists, uses phrase structure
grammar based on Quirk et al.
– searchable with ICECUP and FTFs
• Even lexical studies benefit from parsing
– allows us to focus on when a choice occurs
• You can use DCPSE to carry out many different
experiments on real English
– we looked at change over (recent) time
– we might also look at how decisions interact
Conclusions
• Designing a Corpus Linguistic experiment means thinking
carefully about your hypothesis and then attempting to test
it against the corpus
– We examined the shift from shall to will
– We limited it to first person, declarative, positive cases
– Changing baselines (including ’ll) may lead to different conclusions
• Many corpus studies only consider word baselines (or pmw)
• But it is often better to consider proportions of types of clause or
phrase, or list specific alternative choices
– Alternation (choice) studies aim to hold meaning constant so the
speaker/writer is free to choose between both cases:
• We focused further by subdividing data by modal meaning
Suggested further reading
• On shall vs. will and the progressive:
– Aarts, B., Close, J. and Wallis, S.A. (forthcoming) Choices over time: methodological issues in investigating current change.
In: B. Aarts et al. The changing Verb Phrase, Cambridge: CUP.
• www.ucl.ac.uk/english-usage/projects/verb-phrase/book/aartsclosewallis.pdf
– Barber, C. (1964) Linguistic Change in Present-Day English. Edinburgh
and London: Oliver and Boyd.
– Denison, D. (1998) Syntax. In: S. Romaine (ed.). The Cambridge History
of the English Language. IV: 1776-1997. Cambridge: Cambridge
University Press. 92-329.
– Mair, C. and Leech, G. (2006) Current changes in English syntax.
In: B. Aarts and A. McMahon (eds.) The Handbook of English Linguistics.
Malden MA: Blackwell Publishers. 318-342.
• On statistical tests, confidence intervals and other methods:
– Wallis, S.A. (2010) z-squared: the origin and use of χ². Survey of English Usage, UCL.
• www.ucl.ac.uk/english-usage/statspapers/z-squared.pdf