Developing Statistic-based and
Rule-based Grammar Checkers for
Chinese ESL Learners
Howard Chen
Department of English
National Taiwan Normal University
[email protected]
1
The Need to Provide Feedback on
Second Language Writing



More and more tests ask ESL/EFL students
to demonstrate their writing abilities.
SLA researchers suggest that learners
need more practice and corrective feedback.
However, who can provide them with useful
feedback on meaning and form?
2
Use the Existing Grammar Checkers?


Teachers are the best feedback providers.
However, they have so many essays to correct.
Microsoft grammar checker:
the general impression among ESL/EFL learners is that
it is NOT very useful.
Two new commercial packages: Vantage
MyAccess and ETS Criterion.

Their feedback for ESL learners is not very accurate or
comprehensive (perhaps because they do not target any particular
L1 group and are aimed mainly at native speakers).
3
A More Thorough Review of E-rater: ETS
Criterion



Junko Otoshi (2005), a researcher at
Ritsumeikan University in Japan,
used 28 Japanese adult students' TOEFL writing
essays to explore what Criterion can and cannot do
in providing feedback on the essays.
Criterion's critique function was compared with a
human instructor's error feedback, focusing on five
error categories: verbs, word choice, nouns, articles,
and sentence structure.
4
Errors Marked by Criterion and Human
Instructors (Means)








Error Type           Criterion   Human Instructors
Verbs                0.47        0.84
Nouns                0.00        0.94
Articles             0.07        2.00
Word Choice          0.11        2.32
Sentence Structure   0.32        6.31
5
Rather Disappointing Results and Possible
Reasons




The results revealed that Criterion
had difficulty detecting errors in
all five categories.
Does it aim for higher precision at the cost of
lower recall (a more conservative approach)?
Is the size of the reference corpus a factor?
Another program, MyAccess, has similar
problems, though the general impression
from review reports is that it can detect
more errors.
6
Trying to Combine Different Approaches:
Plan A and B for Grammar Checkers


With funding from the NSC in Taiwan, we
planned to develop two grammar checkers.
Different approaches: parser, rules, statistics.

Plan A: we will use ngrams to help identify
errors.
Plan B: we will use a rule-based grammar
checker to identify errors.
If possible, Plans A and B will be merged, and the
combined system should be able to capture more errors.
In this paper, we discuss only Plan A.
7
What Is the Ngram (Statistical) Checker?




We do not write specific grammar rules.
The computer counts all the
word strings (2-word and 3-word) that occur in a very large native
corpus. This builds a language model.
All of these are saved to a large database.
Then, when students write and submit an
essay to the ngram checker, the system can
quickly detect the word strings that do not
exist in the native corpus.
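
The idea on this slide can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the NTNU implementation: the three-sentence TOY_CORPUS stands in for a large native corpus, and the function names (tokenize, extract_ngrams, build_ngram_table, unseen_ngrams) are hypothetical.

```python
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word:
            tokens.append(word)
    return tokens

def extract_ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_ngram_table(corpus_sentences, sizes=(2, 3)):
    """Count all 2-word and 3-word strings in the reference corpus."""
    table = Counter()
    for sentence in corpus_sentences:
        tokens = tokenize(sentence)
        for n in sizes:
            table.update(extract_ngrams(tokens, n))
    return table

def unseen_ngrams(sentence, table, n=2):
    """Return the ngrams of a learner sentence that never occur in the corpus."""
    return [g for g in extract_ngrams(tokenize(sentence), n) if table[g] == 0]

# Toy stand-in for the native corpus (an assumption for illustration only).
TOY_CORPUS = [
    "I have some books.",
    "He has three answers.",
    "I have three answers.",
]

table = build_ngram_table(TOY_CORPUS)
flagged = unseen_ngrams("I has three answers.", table)
print(flagged)
```

With a corpus this small, almost any new sentence would be flagged; the approach only becomes useful when the reference corpus is very large.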
8
Ngram-based Checker: Advantages




The key idea is simple but powerful.
No need to write rules.
More robust in detecting errors.
A large and suitable corpus might make this
very useful. (ETS used 30 million words of
news text.)
9
The Procedure of Developing an Ngram
Checker (corpora and tools)





1. Find a suitable, large corpus (e.g., BNC,
Wikipedia, and Google)
2. Extract the ngrams (NLP tools, e.g., the SRI tool)
3. Build a large ngram database
4. Develop and test different highlighting
methods
5. Highlight the possibly problematic ngrams
in learners' writing
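
Steps 3 to 5 might look roughly like the sketch below. It is an assumption-laden illustration rather than the actual system: it uses Python's standard sqlite3 module with an in-memory database, a three-sentence toy corpus, bigrams only, and one simple highlighting method (put brackets around every word that belongs to an ngram seen fewer than min_freq times). The table layout and the names build_database, frequency, and highlight are hypothetical.

```python
import sqlite3

def build_database(corpus_sentences, n=2):
    """Step 3: store ngram frequencies in a database (in-memory for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ngrams (gram TEXT PRIMARY KEY, freq INTEGER NOT NULL)")
    for sentence in corpus_sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            # Create the row if it is new, then increment its count.
            conn.execute("INSERT OR IGNORE INTO ngrams (gram, freq) VALUES (?, 0)", (gram,))
            conn.execute("UPDATE ngrams SET freq = freq + 1 WHERE gram = ?", (gram,))
    conn.commit()
    return conn

def frequency(conn, gram):
    """Look up how often an ngram occurred in the reference corpus."""
    row = conn.execute("SELECT freq FROM ngrams WHERE gram = ?", (gram,)).fetchone()
    return row[0] if row else 0

def highlight(sentence, conn, n=2, min_freq=1):
    """Steps 4-5: bracket each word that is part of a rare or unseen ngram."""
    words = sentence.split()
    tokens = [w.lower() for w in words]
    suspicious = [False] * len(words)
    for i in range(len(tokens) - n + 1):
        if frequency(conn, " ".join(tokens[i:i + n])) < min_freq:
            for j in range(i, i + n):
                suspicious[j] = True
    return " ".join(f"[{w}]" if flag else w for w, flag in zip(words, suspicious))

# Toy stand-in for the reference corpus (illustration only).
TOY_CORPUS = ["he cut his finger", "he hurt his finger", "she cut the paper"]
conn = build_database(TOY_CORPUS)
print(highlight("he cutted his finger", conn))
```

Note that this method also brackets the correct neighbours of an erroneous word, which is one reason different highlighting methods need to be compared; raising min_freq makes the checker flag more strings.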
10
Grammar Checker Online
Links:
http://140.122.83.250:4000/main (BNC)
http://140.122.83.250/search.php (Google)
http://140.122.83.245/ngram-check/ (BNC)
11
The Web Interface of Ngram Checker
12
13
14
A Simple Example
15
Evaluating Checker Performance: Is There a
Standard Way of Evaluating Checkers?






What kinds of errors should be used to test a
grammar checker?
Fair assessment: use the same set of sentences for every checker.
How many sentences?
There are many different error categories and errors.
Lexical factors also matter.
NLP researchers use precision, recall,
and F-measure.
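
For reference, precision, recall, and F-measure can be computed as in the short sketch below. The way errors are identified here, as (sentence number, word position) pairs, and the example values are assumptions made for illustration; they are not results from our checkers.

```python
def evaluate(flagged, gold):
    """Compare the errors a checker flagged with the human-annotated (gold) errors."""
    flagged, gold = set(flagged), set(gold)
    true_positives = len(flagged & gold)
    # Precision: what share of the flagged items are real errors?
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: what share of the real errors did the checker find?
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: four annotated errors, three flags, two of them correct.
gold_errors = [(1, 2), (1, 5), (2, 3), (3, 1)]
checker_flags = [(1, 2), (2, 3), (3, 4)]
precision, recall, f1 = evaluate(checker_flags, gold_errors)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

A conservative checker that flags very little can score high on precision and low on recall, which is why both numbers are needed for a fair comparison.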
16
Testing with the CLEC Corpus from China




The size of the Chinese Learner English
Corpus (CLEC):
a 1-million-word error-tagged learner corpus
with about 60 error types.
We selected some sentences (10
sentences) from the learner corpus and then
fed them into our ngram checkers.
17
1. Form
18
2. Verb Phrases (Tense)
19
3. Noun Phrases
20
4. Pronouns
21
5. Adjective Phrases
22
6. Prepositions (seems to be a difficult
area)
23
7. Conjuncts Errors
24
8. Word Errors
25
9. Collocation Errors
26
10. Sentence Structure Errors
27
The Strengths of NTNU Ngram Checkers:







Ngram checkers are good at detecting errors in "local"
or adjacent domains. They can indeed find many
errors in CLEC:
Spelling
Word forms
Verb phrases
Noun phrases
Adj phrases
Collocations
28
The Weaknesses of Ngram Checkers

They failed to catch the following effectively:







Tense errors
Conjuncts errors
Fragments
Pronoun errors
Preposition errors
Run-on sentences
Missing words
29
The Poor Performance of Ngram
Checkers for Tense and Conjuncts
30
Rule-based Checkers Can Perform Better
for Some Nonlocal Errors
31
Wintertree Grammar Checker
32
BUT Ngram Performed Better for the
Local Errors

I have some book. The informations are so rich.
These researches are excellent. He is new friend.
He cutted his finger. He enjoys to eat. He wants
jumping into the river. I cannot decided about this.
These reason are too simple. I has three answers.
33
What Can We Do to Improve Feedback
from Ngram Checkers?
Only highlighting and no detailed feedback?
We are facing a bigger challenge.
How can we recommend correct usage? How can we find
correct examples for students?
If students only see the errors highlighted, they
might still fail to correct them.
For agreement errors, tense errors, and confusing words,
students might be able to self-correct.
However, for some tense errors, collocation
errors, or preposition errors, learners might need
more specific suggestions.

34
Find the Proper Collocates: increase and
improve life
35
Confusion between accept and receive
your apology
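
One possible way to turn such comparisons into a suggestion is to rank the competing words by how often each occurs with the same context in the native corpus. The sketch below is only an assumption about how this could be done, not a description of our system: the four toy sentences stand in for a real corpus, and the names trigram_counts and rank_candidates are hypothetical.

```python
from collections import Counter

def trigram_counts(sentences):
    """Count every 3-word string in the (toy) reference corpus."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2))
    return counts

def rank_candidates(candidates, right_context, counts):
    """Order candidate words by how often they precede the given two-word context."""
    scored = [(word, counts[(word,) + tuple(right_context)]) for word in candidates]
    # Most frequent candidate first; ties keep their original order.
    return sorted(scored, key=lambda pair: -pair[1])

# Toy stand-in for the native corpus (illustration only).
TOY_CORPUS = [
    "i accept your apology",
    "she did not accept your apology",
    "please accept my apology",
    "did you receive your letter",
]

counts = trigram_counts(TOY_CORPUS)
ranking = rank_candidates(["receive", "accept"], ("your", "apology"), counts)
print(ranking)
```

The top-ranked candidate, together with example sentences that contain it, could then be shown to the learner as a suggested correction.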
36
Future Directions for Improvement
1. Test with many different errors and find the
strengths and limitations of ngram-based
checkers and rule-based checkers.
2. Use a tagged learner corpus to find the error
patterns in learner language.
3. Add feedback to ngram-based
checkers for the major error patterns.
4. Better integrate the rule-based system
and the ngram checkers.
37
Thanks for your attention
Questions and Discussion
[email protected]
National Taiwan Normal University

38