CASP_Lisbon_1m - University of California, Davis — Linguistics


Multiple Factors in Second Language Acquisition: The CASP Model
Luna Filipović (University of East Anglia) & John A. Hawkins
(University of California Davis & University of Cambridge)
Introduction






Second language acquisition involves the interplay of a number of
factors that can either facilitate or impede learning. These include:
the typological relationship between L1 and L2;
general principles of learning and critical ages;
general principles of language processing (production and comprehension);
social factors involving the general environment for learning; as well as
pedagogical factors including teaching methods and materials and types of assessment.



Our goal in this talk: to outline a broadly based set of
principles for a multi-factor model of learning in second
language acquisition.
We have developed these principles by drawing on
theoretical insights from numerous branches of the
language sciences: grammatical theory, typology,
language processing, computational linguistics, first
language acquisition and second language acquisition.
We illustrate our findings using some data-driven studies in
the field including our own, as reported in our recent book
Criterial Features in L2 English (CUP, Cambridge).
The criterial feature concept is a new one that we
use to describe the relative stages of second
language acquisition.
Criterial features are grammatical and lexical properties
of an L2 (constructions, words and their meanings, rule
types, errors and their frequencies, etc) that are distinctive
and characteristic of proficiency at different levels of
learning, e.g. at the six levels of proficiency described by
the Common European Framework (CEFR).
Large electronic corpora of learner English make it
possible for us to discover these features.
Our 2012 book uses the Cambridge Learner Corpus (CLC), a
corpus of 40 million words of examination scripts at levels 2-6
(CEFR A2-C2) from learners of English around the world who speak
many different first languages.
The CLC has been tagged for parts of speech and parsed using a
sophisticated automatic parser (Briscoe et al. 2006) permitting
numerous grammatical and lexical searches to be conducted.
Between a third and one half has been error-coded using codes
devised by researchers at CUP.
An analogy for Criterial Features
Think of the defining characteristics for recognising faces in a police
identikit. You don’t need to see all the features of a person’s face in
order to distinguish that person from others, just the important
characteristics that capture essential qualities.
For example, Noun Phrase sequences of Pronoun plus
Infinitive are found at level 2 (A2):
something to eat (level 2)
nothing to do (level 2)
New features found at level 3 (B1) involve more complex
syntax, e.g. an Object Control structure such as:
I ordered him [to gather my men to the hall] (level 3)
This is a criterial construction for level 3 and higher
levels and distinguishes them from levels 1 and 2.
A new feature for level 4 (B2) is the so-called “Pseudocleft”
structure with an initial what functioning as subject of its verb:
What fascinated me was [that I was able to lie on the sea surface] (level 4)
“Subject-to-Object Raising” constructions with the verb
believe appear first at level 5 (C1) and are criterial for this
and the next level:
I believe her [to be this country’s best representative] (level 5)

A “Subject-to-Object Raising” construction with the verb
presume is not found before level 6:
He presumed work to be the way to live (level 6)
Some criterial features are negative rather than
positive, i.e. incorrect properties or errors that
occur with a characteristic frequency for a
particular level or levels (the “error bandwidth”
for that level).
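A search for a criterial construction over a tagged corpus can be pictured roughly as follows. This is only a minimal sketch: the tag names (PRON_INDEF, TO, VERB_INF) and the function find_pronoun_infinitive are hypothetical and are not the tagset or search tools actually used with the CLC. It looks for the level 2 pattern of Pronoun plus Infinitive in a sentence represented as (word, tag) pairs.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, tag); the tag names below are hypothetical


def find_pronoun_infinitive(tokens: List[Token]) -> List[str]:
    """Return every 'indefinite pronoun + to + infinitive' sequence found."""
    hits = []
    for i in range(len(tokens) - 2):
        (w1, t1), (w2, t2), (w3, t3) = tokens[i:i + 3]
        if t1 == "PRON_INDEF" and t2 == "TO" and t3 == "VERB_INF":
            hits.append(f"{w1} {w2} {w3}")
    return hits


# Hand-tagged toy sentence, for illustration only
sentence = [
    ("I", "PRON"),
    ("want", "VERB"),
    ("something", "PRON_INDEF"),
    ("to", "TO"),
    ("eat", "VERB_INF"),
]
print(find_pronoun_infinitive(sentence))  # ['something to eat']
```

Counting such hits separately for scripts at each level would show where a construction first appears; a real search would work over parser output rather than hand-tagged tuples.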
We can now ask: WHY do we see these
patterns in the data and why do we see the
criterial features changing the way they do
at the different levels? In particular, WHAT
is it about the features of the higher
proficiency levels that makes them late
acquired rather than early?
It cannot be that learners are simply imitating
the words and constructions they are taught
in their textbooks.
First, because there are many different textbooks
and teaching methods around the world.
Second, because learners learn more than they are
explicitly taught, from their reading materials, papers,
magazines, movies, TV, conversations, and so on.
I.e. second language learning shares many, though obviously not all,
of its properties with first language learning.





We now have enough principles of learning with empirical support for
their precise formulation and interaction to propose a model of SLA in
the spirit of a complex adaptive system (Gell-Mann 1992).
In this model multiple factors interact to produce a range of observable
outcomes and different kinds of interlanguages.
Some of our principles have parallels in first language learning (Slobin
1977, Tomasello 2003, Diessel 2004, MacWhinney 2005).
The model is called “CASP”, short for complex adaptive system
principles of SLA.
Some other recent models of SLA are also multi-factor and
interactionist (e.g. Larsen-Freeman and Cameron 2008, Ellis &
Larsen-Freeman 2009, MacWhinney 2005). Ours is symbolic rather than
connectionist, defined in terms of grammatical and lexical primitives,
typologically and psycholinguistically informed, and empirically driven.
CASP: General Principles





(A) Minimize Learning Effort (MiL)
Learners of a second language (L2) prefer to minimize
learning effort when they learn the grammatical and
lexical properties of the L2.
Learning effort is minimized when grammatical and lexical
properties are shared between L1 and L2 and can be transferred
directly into the L2, exploiting pre-existing knowledge from the L1.
It is minimized when properties of the L2 are frequently occurring in
the L2 input, which increases their exposure to the learner.
It is minimized when structural and semantic properties of the L2 are
simple rather than complex.



(B) Minimize Processing Effort (MiP)
Learners of a second language (L2) prefer to minimize
processing effort when they use the grammatical and
lexical properties of the L2, just as native speakers do.
There is a debate in the SLA literature over the extent to which
developmental stages in SLA are shaped by ease of processing or
by ease of learning. Our view is that there is a need for both, given
certain dissociations between them, even though their predictions
often overlap.



Principles (A) and (B) are principles of least effort. If these were the
only principles determining learning and production, our learner
corpora would reveal increasingly minimal outputs.
Clearly, they do not. MLUs (i.e. mean length of utterance figures)
increase at higher proficiency levels (cf. Hawkins & Filipović
2012:ch.2.2) as greater use is made of less frequent and more
complex structures and meanings.
The reason is that learners are trying to increase their expressive
power in the L2, and to behave like native speakers, which means
learning and using the mix of infrequent and frequent, and complex
and simple, linguistic items, just as native speakers do.


(C) Maximize Expressive Power (MaE)
Learners of a second language (L2) prefer to maximize
their expressive power, i.e. to formulate in the L2
whatever thoughts they would wish to express in the L1,
and to perform the same language functions as L1
users.
Successive stages of acquisition reveal more native-like L2 outputs
with increasingly complex and less frequent structures for the
expression of increasingly complex thoughts, in partial opposition to
principles (A) MiL and (B) MiP.




(D) Maximize Communicative Efficiency (MaC)
Learners of a second language (L2) prefer to maximize
their communicative efficiency in relation to the hearer
and his/her mental model.
Communication is efficient when the message (M) intended by the
speaker (S) is calibrated to the hearer's (H) mental model in such a
way as to achieve accurate and rapid comprehension of M.
This requires sometimes more, sometimes less, processing effort,
in partial opposition to principle (B) MiP.
CASP: Specific Principles
(1) Maximize Positive Transfer (MaPT)
Properties of the L1 which are also present in the
L2 are learned more easily and with less learning
effort, and are readily transferred, on account of
pre-existing knowledge in L1.
Shared L1/L2 properties should result in earlier L2 acquisition, in more
of the relevant properties being learned, and in fewer errors, unless
these shared properties involve e.g. high complexity and are impacted
by other factors.
Principle (1) (MaPT) derives from general principle (A) MiL.
Positive transfers are also good for processing since processing
mechanisms used for production and comprehension in the L1 can be
applied directly to the L2, see (B) MiP.
Positive transfers also enhance the expressive power of the
non-native L2 user, see (C) MaE.
Communicative efficiency can also be maximized by positive transfers,
see (D) MaC.
The next slide shows missing determiner error rates for
“the” and “a” at successive proficiency levels for French,
German and Spanish learners of English, and for
Turkish, Japanese and Russian learners.
I.e. errors such as I spoke to President (missing the)
I have car (missing a)
The first three L1s have an article system similar to
English, the latter three do not.
The figures indicate the percentage of errors with respect to the total
number of correct uses. E.g. a percentage of 10.0% means a
determiner was omitted 1 in every 10 times it should have appeared.
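The calculation behind such a figure can be sketched as follows. The wording above leaves the denominator slightly open; this sketch assumes the reading implied by the worked example, i.e. omissions divided by all contexts in which the determiner should have appeared. The function name and the counts are hypothetical and are not taken from the CLC.

```python
def missing_determiner_rate(omitted: int, supplied: int) -> float:
    """Percentage of obligatory contexts in which the determiner was omitted.

    Assumption: obligatory contexts = omitted + correctly supplied uses.
    """
    contexts = omitted + supplied
    if contexts == 0:
        raise ValueError("no obligatory contexts to count")
    return 100.0 * omitted / contexts


# Hypothetical counts: 10 omissions against 90 correct uses
rate = missing_determiner_rate(omitted=10, supplied=90)
print(f"{rate:.1f}%")  # 10.0%, i.e. omitted 1 in every 10 times
```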
Determiner Error Rates in L2 English: Missing “the” (%)

Level    French    German    Spanish
2          4.76      0.00       3.37
3          4.67      2.56       3.62
4          5.01      4.11       4.76
5          3.11      3.11       3.22
6          2.13      1.60       2.21

Level    Turkish   Japanese   Russian
2         22.06     27.66      14.63
3         20.75     25.91      22.73
4         21.32     18.72      18.45
5         14.44     13.80      14.62
6          7.56      9.32       9.57
I.e. generally low error rates for French, German and Spanish, without
significant deviation between levels.
Turkish, Japanese and Russian show much higher error rates, with
significant improvements especially at levels 5 and 6.
These figures support principle (1) MaPT.
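The contrast between the two groups of L1s can be made concrete by averaging the missing “the” rates per level. This is a minimal sketch: the rates are the ones reported in this talk, while the grouping variables and the group_means helper are illustrative names only.

```python
from statistics import mean

LEVELS = [2, 3, 4, 5, 6]

# Missing "the" error rates (%) at levels 2-6, as reported in this talk
MISSING_THE = {
    "French":   [4.76, 4.67, 5.01, 3.11, 2.13],
    "German":   [0.00, 2.56, 4.11, 3.11, 1.60],
    "Spanish":  [3.37, 3.62, 4.76, 3.22, 2.21],
    "Turkish":  [22.06, 20.75, 21.32, 14.44, 7.56],
    "Japanese": [27.66, 25.91, 18.72, 13.80, 9.32],
    "Russian":  [14.63, 22.73, 18.45, 14.62, 9.57],
}
ARTICLE_L1S = ["French", "German", "Spanish"]
NO_ARTICLE_L1S = ["Turkish", "Japanese", "Russian"]


def group_means(rates, group):
    """Mean error rate per level across the L1s in a group."""
    return [mean(rates[l1][i] for l1 in group) for i in range(len(LEVELS))]


with_articles = group_means(MISSING_THE, ARTICLE_L1S)
without_articles = group_means(MISSING_THE, NO_ARTICLE_L1S)

for level, a, b in zip(LEVELS, with_articles, without_articles):
    print(f"level {level}: article L1s {a:5.2f}%   article-less L1s {b:5.2f}%")
```

An unweighted mean across three L1s is a simplification, since it ignores how many scripts each L1 contributes at each level.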
Determiner Error Rates in L2 English: Missing “a” (%)

Level    French    German    Spanish
2          6.60      0.89       4.52
3          4.79      2.90       4.28
4          6.56      3.83       7.91
5          4.76      3.62       5.16
6          3.41      2.02       3.58

Level    Turkish   Japanese   Russian
2         24.29     35.09      21.71
3         27.63     34.80      30.17
4         32.48     24.26      26.37
5         23.89     27.41      20.82
6         11.86     15.56      12.69
These figures show a similar pattern to the last slide, and similar
support for (1) MaPT.
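One simple way to check how far apart the two groups are for missing “a” is to compare the worst rate of any article L1 with the best rate of any article-less L1. This is a sketch over the figures reported in this talk; the variable names are illustrative only.

```python
# Missing "a" error rates (%) at levels 2-6, as reported in this talk
MISSING_A = {
    "French":   [6.60, 4.79, 6.56, 4.76, 3.41],
    "German":   [0.89, 2.90, 3.83, 3.62, 2.02],
    "Spanish":  [4.52, 4.28, 7.91, 5.16, 3.58],
    "Turkish":  [24.29, 27.63, 32.48, 23.89, 11.86],
    "Japanese": [35.09, 34.80, 24.26, 27.41, 15.56],
    "Russian":  [21.71, 30.17, 26.37, 20.82, 12.69],
}
ARTICLE_L1S = ["French", "German", "Spanish"]
NO_ARTICLE_L1S = ["Turkish", "Japanese", "Russian"]

# Highest rate for any L1 with articles, at any level
worst_with_articles = max(max(MISSING_A[l1]) for l1 in ARTICLE_L1S)
# Lowest rate for any L1 without articles, at any level
best_without_articles = min(min(MISSING_A[l1]) for l1 in NO_ARTICLE_L1S)

print(worst_with_articles, best_without_articles)  # 7.91 11.86
```

Even the lowest rate for the article-less L1s (Turkish at level 6) is above the highest rate for the L1s with articles (Spanish at level 4), so the two groups do not overlap at any level.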
(2) Maximize Frequently Occurring Properties (MaF)
Properties of the L2 are learned in proportion to their
frequency of occurrence (e.g. as measured in the
BNC): more frequent exposure of a property to the
learner facilitates its learning and reduces learning
effort.
I.e. more frequent properties will result in earlier L2
acquisition, more of the relevant properties learned,
and fewer errors, in general.




Consider the basic sentence types and constructions of English
described in terms of “verb co-occurrence frames” (Williams 2007,
Hawkins & Filipović 2012).
At the earliest level (2) we find simple and frequently occurring
intransitive sentence types (he went), transitive types (he loved
her), and basic three-place predicate types with a prepositional
phrase (she added the flowers to the bouquet).
At the higher levels (3, 4 and above) we see more complex and less
frequent sentence types, for example different embeddings (he
explained how to do it, he asked whether he should come, he told
the audience that he was leaving) and gerundive verbs with –ing (I
caught him stealing, they worried about him drinking).
There is a precise correlation between the order of acquisition and
degree of frequency in the input, cf. (2) MaF:
Average Token Frequencies in Native English Corpora
(including BNC) for New Verb Co-occurrence Frames at
the Learner Levels

Level    BNC frequency
2            1,041,634
3               38,174
4/5/6           27,615
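The relation between level and input frequency in this table can be checked with a few lines. This is a sketch over the three averages given in the talk; the variable names are illustrative, and the log scale is added only to make the size of the drop easier to see.

```python
import math

# Average token frequency of the verb co-occurrence frames that are new
# at each learner level (figures from the table in this talk)
NEW_FRAME_FREQ = [("2", 1_041_634), ("3", 38_174), ("4/5/6", 27_615)]

freqs = [freq for _, freq in NEW_FRAME_FREQ]
# True if every later level introduces less frequent frames than the one before
is_decreasing = all(a > b for a, b in zip(freqs, freqs[1:]))

for level, freq in NEW_FRAME_FREQ:
    print(f"level {level:>5}: {freq:>9,}  (log10 = {math.log10(freq):.2f})")
print("frequency falls at each later level:", is_decreasing)
```

The largest drop is between levels 2 and 3, well over an order of magnitude, while the difference between level 3 and levels 4/5/6 is much smaller.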
(3) Maximize Structurally and Semantically Simple
Properties (MaS)
Properties of the L2 are learned in proportion to their
structural and semantic simplicity: simplicity means
there are fewer properties to be learned and less
learning effort is required.
I.e. simpler properties will result in earlier L2 acquisition, more of
the relevant properties learned, and fewer errors, in general.
Cf. Hawkins (2004, 2009) for issues in language complexity.
E.g. simpler consonants and consonantal distinctions in a phonological
inventory are acquired earlier than more complex ones (see e.g.
Eckman 1984).
Simpler relative clause constructions are acquired earlier than more
complex ones (Hawkins & Filipović 2012).
Simpler and more basic meanings for verbs are acquired earlier
than more complex and derived extensions in meaning, figurative
uses, etc.
The verb break in its basic physical sense: level 2;
break in the sense of INTERRUPT (break the routine): level 3;
break an agreement, promise, etc.: level 4;
break the bank (idiomatic): level 5;
break the wall that surrounds him (original figurative): level 6.
Cf. Hawkins & Filipović (2012:ch.8)



(4) Permit Negative Transfer (PNT)
Properties of the L1 which are not present in the L2 can
be transferred, resulting in errors, as learners strive to
achieve an expressive power and communicative
efficiency in L2 comparable to that in their L1 (see
principles C and D), while minimizing learning effort
(principle A) and/or processing effort (principle B).
I.e. when grammatical and lexical properties are shared, transfers
from L1 into L2 result in positive or correct properties in the L2.
When properties are not shared, and the transfer still takes place,
this results in negative or incorrect properties in the L2.




Positive transfers are maximized on account of general principles
(A)-(D).
Negative transfers are different. Sometimes they occur and
sometimes they don’t, i.e. they are not maximized. This is one of the
big unresolved issues in current SLA. When will they occur and
when not? The general and more specific principles of our model
can help us understand this.
We see negative transfers as motivated by the desire to maximize
expressive power (principle (C) MaE) and also to maximize
communicative efficiency (principle (D) MaC) in an L2 system that
has been incompletely learned, while at the same time minimizing
learning effort (principle (A) MiL) and processing effort (principle (B)
MiP).
The same general forces that result in positive transfers lead to
negative transfers as well.




But the major difference is that there are severe limitations on
expressive power and on communicative efficiency that can be
conveyed by linguistic properties that are not part of the L2 and not
used by its native speakers.
When native speakers communicate with L2 learners they tolerate
and compensate for departures from the native language
conventions. But when learners depart too radically from these
conventions, they are not understood by native speakers.
Learners accordingly acquire a sensitivity to the native speaker’s
ability to compensate for these violations in conventions of grammar
and use.
This, we believe, plays a major role in determining whether and
when negative transfer can occur.



In phonology, substitutions of L1 consonants like [t] or [s] or [f] for L2
[θ] in English thin minimize learning and processing effort for
learners whose L1s do not have this consonant, generally with
communicative success (Lado 1957).
In syntax, Spanish Pro-Drop (e.g. *is a beautiful country for it is a
beautiful country) is often transferred into early L2 English to
express the proposition in question, and the removal of the subject
does not impede communicative success. This structure is simpler
than its English counterpart with an overt subject, and transfer is not
blocked, as predicted by our principle (3) MaS.
Similarly, many article omission errors do not diminish expressive
power and communicative success, and at the same time they
minimize learning and processing effort through the transfer of L1
structures, see above.





By contrast, Chinese prenominal relative clauses do not result in
errors whereby the English man whom the woman loves is changed
into its Chinese prenominal counterpart *the woman loves whom
man.
This Chinese structure is complex and typologically marked
cross-linguistically (Hawkins 2004).
Complex or marked structures in an L1 without an L2 equivalent will
not generally transfer negatively, on account of principle (3) MaS
(cf. Eckman 2011).
Similarly complex or less frequent structures and meanings may not
transfer from L1 to an L2 even when they are shared, in opposition
to principle (1) MaPT.
What this means is that principles (2) MaF and (3) MaS can block
both positive and negative transfers into L2.

We propose principle (5), which derives from the need for
communicative efficiency (principle (D) MaC), and which
ultimately reflects the sensitivity of learners to their native
speaking interlocutors and the latter’s tolerance for errors.

(5) Communicative Blocking of Negative Transfer (CBN)
The transfer of negative properties from L1 to L2 is
filtered in proportion to communicative efficiency
(principle D): the more an L1 property impedes efficient
communication in L2, the less negative transfer there is.
Consider the basic word orders of English and Japanese.
These languages have mirror-image patterns, head-initial
versus head-final, that are both frequent and productive
across languages: [went [to [the cinema]]] versus [[[the
cinema] to] went] (Greenberg 1966; Dryer 1992; Hawkins 1983,
1994, 2004).
Head-final orders are not transferred into L2 English by Japanese
learners because, we argue, that would result in extreme
communicative inefficiency: speakers using Japanese word orders
in English L2 would simply not be understood! By contrast,
head-initial word order variants of Spanish that lack precise counterparts
in English (e.g., I read yesterday the book) are negatively
transferred into L2 English, since they do not impact efficient
communication.


We predict that because Japanese is a head-final language, the
contrast with the mirror-image word order patterns of English is
considerable and transferring head-final patterns into a head-initial
language like English, and vice versa, would significantly impair
communication. This is why it is imperative for Japanese learners of
English, and English learners of Japanese, to acquire correct basic
word orders in their L2s early.
But speakers of L1 languages with flexible SVO like Spanish do not
have the same incentive, because even when they transfer incorrect
orders from their L1s into a fundamentally similar head-initial English
L2, communication is not significantly impaired.

(6) Order of Second Language Acquisition (OSLA)

The order of acquisition for properties of the L2 is in accordance
with general principles (A)-(D), and with the more specific principles
and patterns that are supported empirically. These principles can be
incorporated within a multi-factor model of SLA, the CASP model,
and used to define possible versus impossible, and likely versus
unlikely, interlanguage stages proceeding from a given L1 to a given
L2.

These principles operate collectively to make constrained
predictions for the acquisition of properties of L2 English and of
other languages, and for their relative sequencing. Their interaction
is complex, because there are several such principles, which
sometimes compete and sometimes cooperate, because they are
gradient, and because they have different relative strengths.
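The talk does not give a formal statement of how the principles combine. Purely as an illustration of what gradient principles with different relative strengths could look like, the sketch below scores L2 properties on three of the specific principles and ranks them by a weighted sum. All scores and weights are hypothetical placeholders, not values from the CASP model or from the CLC, and a weighted sum is only one of many possible ways to combine competing factors.

```python
from dataclasses import dataclass


@dataclass
class L2Property:
    name: str
    shared_with_l1: float  # 0..1, feeds (1) MaPT; hypothetical score
    frequency: float       # 0..1, feeds (2) MaF;  hypothetical score
    simplicity: float      # 0..1, feeds (3) MaS;  hypothetical score


# Hypothetical relative strengths of the three principles
WEIGHTS = {"MaPT": 0.4, "MaF": 0.3, "MaS": 0.3}


def ease_of_acquisition(p: L2Property) -> float:
    """Weighted sum of the three factor scores (higher = expected earlier)."""
    return (WEIGHTS["MaPT"] * p.shared_with_l1
            + WEIGHTS["MaF"] * p.frequency
            + WEIGHTS["MaS"] * p.simplicity)


# Illustrative properties with invented scores
properties = [
    L2Property("raising with 'presume'", 0.2, 0.1, 0.2),
    L2Property("simple transitive frame", 1.0, 0.9, 0.9),
]

ranked = sorted(properties, key=ease_of_acquisition, reverse=True)
for p in ranked:
    print(f"{p.name}: {ease_of_acquisition(p):.2f}")
```

Changing the weights changes which factor wins when the scores pull in different directions, which is the sense in which the principles sometimes compete and sometimes cooperate.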
Key sources:
Filipović, L. & J.A. Hawkins (forthcoming 2013) ‘Multiple factors in second
language acquisition: The CASP model’
Hawkins, J.A. & L. Filipović (2012) Criterial Features in L2 English:
Specifying the Reference Levels of the Common European Framework.
CUP, Cambridge.
References
Briscoe, E., J. Carroll and R. Watson (2006) ‘The second release of the RASP
system’. In Proceedings of the COLING/ACL 2006 Interactive Presentation
Sessions, Sydney, Australia.
Council of Europe (2001) Common European Framework of Reference for Languages:
Learning, teaching, assessment. CUP, Cambridge.
Diessel, H. (2004) The Acquisition of Complex Sentences. CUP, Cambridge.
Dryer, M.S. (1992) ‘The Greenbergian word order correlations’, Language 68: 81-138.
Eckman, F.R. (1984) ‘Universals, typologies, and interlanguage’. In: W.E. Rutherford,
ed., Language Universals and Second Language Acquisition, John Benjamins,
Amsterdam, 79-105.
Eckman, F.R. (2011) ‘Linguistic typology and second language acquisition’. In: J.J.
Song, ed., The Oxford Handbook of Linguistic Typology, OUP, Oxford, 618-633.
Ellis, N.C. & D. Larsen-Freeman (2009) ‘Constructing a second language: Analyses
and computational simulations of the emergence of linguistic constructions from
usage’. In N.C. Ellis & D. Larsen-Freeman, eds., Language as a Complex
Adaptive System, Wiley-Blackwell, Chichester, 90-125.
Gell-Mann, M. (1992) ‘Complexity and complex adaptive systems’. In J.A. Hawkins & M.
Gell-Mann, eds., The Evolution of Human Languages, Addison-Wesley, Redwood City, CA.
Greenberg, Joseph H. (1966) ‘Some universals of grammar with particular reference to the
order of meaningful elements’. In J.H. Greenberg, ed., Universals of Language, MIT Press,
Cambridge, Mass., 73-113.
Hawkins, J.A. (1983) Word Order Universals. Academic Press, New York.
Hawkins, J.A. (1994) A Performance Theory of Order and Constituency. CUP, Cambridge.
Hawkins, J.A. (2004) Efficiency and Complexity in Grammars. OUP, Oxford.
Hawkins, J.A. (2009) ‘An efficiency theory of complexity and related phenomena’. In D. Gil,
G. Sampson & P. Trudgill, eds., Complexity as an Evolving Variable, OUP, Oxford, 252-268.
Lado, R. (1957) Linguistics Across Cultures: Applied Linguistics for Language Teachers.
University of Michigan, Ann Arbor, Michigan.
Larsen-Freeman, D. & L. Cameron (2008) Complex Systems in Applied Linguistics. OUP,
Oxford.
MacWhinney, B. (2005) ‘A unified model of language acquisition’. In J.F. Kroll & A.M.B. de
Groot, eds., Handbook of Bilingualism: Psycholinguistic Approaches, OUP, Oxford, 49
Slobin, D.I. (1977) ‘Language in childhood and in history’. In J. Macnamara, ed., Language
Learning and Thought: Perspectives in Neurolinguistics and Psycholinguistics, University
of Maryland Press, Maryland, 185-214.
Tomasello, M. (2003) Constructing a Language: A Usage-based Theory of Language
Acquisition. Harvard University Press, Cambridge, Mass.
Acknowledgments
We acknowledge, with gratitude, the following sources of financial
support for the research reported in this paper:
Funding from Cambridge Assessment and Cambridge University Press
to both authors for completion of the 2012 CUP book Criterial Features
in L2 English.
A Leverhulme and Newton Trust Postdoctoral Research Fellowship to
the first author (at the University of Cambridge 2008-2011).
Research funds to the second author from the Research Centre for
English and Applied Linguistics, University of Cambridge (2008-2011),
including a teaching buyout at the University of California Davis
(2008-2010), and research funds from UC Davis (2008-2010), including a UC
Davis Seed Grant for International Outreach.