Spell checking using the Google Web API Su Zhang Email: sz24

Problems outlined
Most existing spell checking systems make
little use of in-sentence context. As a
result, they often replace misspellings with
unusual and inappropriate spelling
suggestions. Moreover, they have trouble
identifying and correcting real-word
errors.
Recently, some experimental systems have
attempted to use context, but they are limited
by the need to store the text base on the
user’s machine.
Applying the Google Web API as
a web text repository
This is an alternative approach: a new
spell checking program that applies this
technique can retrieve the text on the fly.
It breaks the input text into n-word fragments. Then
it searches Google for occurrences of each n-word
fragment and identifies bad fragments and
misspellings by comparing the number of results
returned for each fragment.
Lastly, it provides the user with candidate
corrections.
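
The fragmenting step can be sketched in Python as a sliding window over the words. This is only an illustration, not the program's actual code: the function name split_into_fragments and the default n=3 are assumptions, and words are split on whitespace so punctuation stays attached to its word.

```python
def split_into_fragments(text, n=3):
    """Return every run of n consecutive words in text (hypothetical helper)."""
    words = text.split()  # whitespace split; punctuation stays on its word
    if not words:
        return []
    if len(words) < n:
        # Text shorter than n words is returned as a single fragment.
        return [" ".join(words)]
    # Slide an n-word window one word at a time across the text.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


fragments = split_into_fragments("He is going to fine out.")
for fragment in fragments:
    print(fragment)
```

Each fragment would then be sent to the search service as a query, and the number of returned results recorded for it.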
e.g. He is going to fine out. (Single
real-word error case)
This sentence is divided into 3-word fragments:
a. He is going (1,350,000 returned results)
b. is going to (101,000,000 returned results)
c. going to fine (37,100 returned results)
d. to fine out. (33,500 returned results)
Since fragment d has the fewest
returned results, it is identified as the worst
fragment. It contains the misspelling, which
is the word “fine”.
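
The comparison in this example can be sketched as follows, using the result counts listed above rather than a live query. The name worst_fragment and the dictionary layout are assumptions made for illustration; the program's real data structures are not described here.

```python
def worst_fragment(hit_counts):
    """Return the fragment with the fewest returned results (hypothetical helper)."""
    return min(hit_counts, key=hit_counts.get)


# Result counts copied from the example above, not fetched from Google.
hit_counts = {
    "He is going": 1_350_000,
    "is going to": 101_000_000,
    "going to fine": 37_100,
    "to fine out.": 33_500,
}

worst = worst_fragment(hit_counts)
print(worst)  # prints "to fine out."
```

Picking the minimum only locates the suspect fragment; choosing which word inside it to correct is a separate step that this sketch does not cover.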