Chunking, continued


SIMS 290-2:
Applied Natural Language Processing
Marti Hearst
Sept 22, 2004
Today
Cascaded Chunking
Example of Using Chunking: Word Associations
Evaluating Chunking
Going to the next level: Parsing
Cascaded Chunking
Goal: create chunks that include other chunks
Examples:
PP consists of preposition + NP
VP consists of verb followed by PPs or NPs
How to make it work in NLTK
The tutorial is a bit confusing; I attempt to clarify it here
Creating Cascaded Chunkers
Start with a sentence token
A list of words with parts of speech assigned
Create a fresh one or use one from a corpus
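For concreteness, a sentence token can be sketched as a list of (word, tag) pairs; the sentence below is hand-tagged for illustration, but one could equally take one from a tagged corpus:

```python
# A sentence token: a list of words with parts of speech assigned
sentence = [("the", "DT"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
```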
Creating Cascaded Chunkers
Create a set of chunk parsers
One for each chunk type
Each one takes as input a list of tokens, and
produces as output a NEW list of tokens
– You can decide what this new list is called
 Examples: NP-CHUNK, PP-CHUNK, VP-CHUNK
– You can also decide what to name each occurrence of the
chunk type, as it is assigned to a subset of tokens
 Examples: NP, VP, PP
How are higher-level tags matched?
The parser just seems to match their string description
So be certain that chunk-type names do not overlap with
POS tag names
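The cascade can be sketched in plain Python by rewriting a string of POS tags one level at a time; the tag patterns below are hypothetical stand-ins for real chunking rules, and the purely textual matching also shows why chunk names must not collide with POS tags:

```python
import re

def chunk(tags, label, pattern):
    # Replace every tag sequence matching `pattern` with a single
    # chunk label: one level of the cascade
    return re.sub(pattern, label, tags)

# "the cat sat on the red mat", as a string of POS tags
sent = "DT NN VBD IN DT JJ NN"

np = chunk(sent, "NP", r"DT( JJ)* NN")    # NP chunks first
pp = chunk(np,   "PP", r"IN NP")          # PP = preposition + NP
vp = chunk(pp,   "VP", r"VBD( PP| NP)*")  # VP = verb followed by PPs/NPs
# Because matching is purely textual, a chunk named "NN" would be
# confused with the POS tag NN at the next level of the cascade.
```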
Let’s do some text analysis
Let’s try this on more complex sentences
First, read in part of a corpus
Then, count how often each word occurs with each POS
Determine some common verbs, choose one
Make a list of sentences containing that verb
Test out the chunker on them; examine further
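The steps above can be sketched with a Counter over (word, tag) pairs; the tiny hand-tagged corpus below stands in for a real one such as Brown:

```python
from collections import Counter

# Toy stand-in for tagged corpus sentences (lists of (word, tag) pairs)
corpus = [
    [("the", "DT"), ("judge", "NN"), ("said", "VBD"), ("no", "UH")],
    [("she", "PRP"), ("said", "VBD"), ("yes", "UH")],
    [("he", "PRP"), ("left", "VBD"), ("early", "RB")],
]

# Count how often each word occurs with each POS
freq = Counter(pair for sent in corpus for pair in sent)

# Determine the common verbs (tags starting with VB), choose the top one
verbs = [w for (w, t), _ in freq.most_common() if t.startswith("VB")]
verb = verbs[0]

# Make a list of sentences containing that verb, ready for the chunker
hits = [sent for sent in corpus if any(w == verb for w, _ in sent)]
```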
Why didn’t this parse work?
Corpus Analysis for Discovery of
Word Associations
Classic paper by Church & Hanks showed how to use a corpus
and a shallow parser to find interesting dependencies between
words
– Word Association Norms, Mutual Information, and Lexicography,
Computational Linguistics, 16(1), 1990
– http://www.research.att.com/~kwc/publications.html
Some cognitive evidence:
Word association norms: which word do people say most
often after hearing another word?
– Given doctor: nurse, sick, health, medicine, hospital…
People respond more quickly to a word if they’ve seen an
associated word
– E.g., if you show “bread” they’re faster at recognizing “butter”
than “nurse” (vs a nonsense string)
Corpus Analysis for Discovery of
Word Associations
Idea: use a corpus to estimate word associations
Association ratio: log ( P(x,y) / P(x)P(y) )
The probability of seeing x followed by y, vs. the probability
of seeing x anywhere times the probability of seeing y
anywhere
P(x) is how often x appears in the corpus
P(x,y) is how often y follows x within w words
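A worked example with hypothetical counts (the window size w is folded into the count of y-follows-x):

```python
import math

N = 1_000_000           # corpus size in words (hypothetical)
c_x, c_y = 100, 200     # occurrences of x and of y
c_xy = 20               # times y follows x within the window

P_x, P_y, P_xy = c_x / N, c_y / N, c_xy / N
assoc = math.log2(P_xy / (P_x * P_y))   # association ratio
# The ratio inside the log is 1000, so the association ratio is
# about 9.97 bits; independent words would score near 0.
```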
Interesting associations with “doctor”:
– X: honorary   Y: doctor
– X: doctors    Y: dentists
– X: doctors    Y: nurses
– X: doctors    Y: treating
– X: examined   Y: doctor
– X: doctors    Y: treat
Corpus Analysis for Discovery of
Word Associations
Now let’s make use of syntactic information.
Look at which words and syntactic forms follow a given
verb, to see what kinds of arguments it takes
Compute triples of subject-verb-object
Example: nouns that appear as the object of the verb
“drink”:
– martinis, cup_water, champagne, beverage, cup_coffee,
cognac, beer, cup, coffee, toast, alcohol…
– What can we note about many of these words?
Example: verbs that have “telephone” in their object:
– sit_by, disconnect, answer, hang_up, tap, pick_up, return,
be_by, spot, repeat, place, receive, install, be_on
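With subject-verb-object triples in hand, both kinds of lists above drop out of simple filters; the triples below are hypothetical examples of what a shallow parser might emit:

```python
from collections import Counter

# Hypothetical (subject, verb, object) triples from a shallow parser
triples = [
    ("she", "drink", "coffee"),
    ("he", "drink", "beer"),
    ("they", "drink", "coffee"),
    ("he", "answer", "telephone"),
    ("she", "pick_up", "telephone"),
]

# Nouns appearing as the object of "drink"
drink_objects = Counter(o for s, v, o in triples if v == "drink")

# Verbs that have "telephone" as their object
telephone_verbs = sorted(v for s, v, o in triples if o == "telephone")
```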
Corpus Analysis for Discovery of
Word Associations
The approach has become standard
Entire collections available
Dekang Lin’s Dependency Database
– Given a word, retrieve words that had dependency
relationship with the input word
Dependency-based Word Similarity
– Given a word, retrieve the words that are most similar
to it, based on dependencies
http://www.cs.ualberta.ca/~lindek/demos.htm
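As a rough sketch of the similarity idea, words can be compared by the overlap of their dependency contexts; the feature sets below are hypothetical, and plain Jaccard overlap is a crude stand-in for Lin's actual information-theoretic measure:

```python
# Hypothetical dependency contexts: the (relation, word) pairs
# each verb was observed with in a parsed corpus
features = {
    "sell": {("obj", "car"), ("obj", "house"), ("subj", "dealer")},
    "buy":  {("obj", "car"), ("obj", "house"), ("subj", "customer")},
    "eat":  {("obj", "apple"), ("subj", "child")},
}

def similarity(a, b):
    # Jaccard overlap of dependency contexts; Lin's measure instead
    # weights shared features by mutual information
    fa, fb = features[a], features[b]
    return len(fa & fb) / len(fa | fb)

ranked = sorted((w for w in features if w != "sell"),
                key=lambda w: similarity("sell", w), reverse=True)
```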
Example Dependency Database:
“sell”
Example Dependency-based
Similarity: “sell”
Homework Assignment
Choose a verb of interest
Analyze the context in which the verb appears
Can use any corpus you like
– Can train a tagger and run it on some fresh text
Example: What kinds of arguments does it take?
Improve on my chunking rules to get better
characterizations
Evaluating the Chunker
Why not just use accuracy?
Accuracy = #correct/total number
Definitions
Total: number of chunks in the gold standard
Guessed: set of chunks that were labeled
Correct: of the guessed, which were correct
Missed: how many correct chunks were not guessed
Precision: #correct / #guessed
Recall: #correct / #total
F-measure: 2 * (Prec * Recall) / (Prec + Recall)
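The definitions above can be sketched as a small scoring function; chunks are compared here as hypothetical (start, end, label) triples, so location is part of each chunk’s identity:

```python
def score(gold, guessed):
    # Precision, recall, and F-measure over sets of chunks
    correct = len(gold & guessed)
    precision = correct / len(guessed)
    recall = correct / len(gold)
    f = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f

gold    = {(0, 2, "NP"), (2, 3, "VP"), (3, 5, "PP"), (5, 7, "NP")}
guessed = {(0, 2, "NP"), (2, 3, "VP"), (3, 4, "PP"),
           (4, 5, "NP"), (5, 7, "NP")}
p, r, f = score(gold, guessed)   # 3 correct, out of 5 guessed and 4 gold
```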
Example
Assume the following numbers
Total: 100
Guessed: 120
Correct: 80
Missed: 20
Precision: 80 / 120 = 0.67
Recall: 80 / 100 = 0.80
F-measure: 2 * (.67 * .80) / (.67 + .80) = 0.73
Evaluating in NLTK
We have some already chunked text from the Treebank
The code below uses the existing parse to compare
against, and to generate Tokens of type word/tag to parse
with our own chunker.
We have to add location information so the evaluation code can
compare which words have been assigned which labels
How to get better accuracy?
Use a full syntactic parser
These days the probabilistic ones work surprisingly well
They are getting faster too.
Prof. Dan Klein’s is very good and easy to run
– http://nlp.stanford.edu/downloads/lex-parser.shtml
Next Week
Shallow Parsing Assignment
Due on Wed Sept 29
Next week:
Read paper on end-of-sentence disambiguation
Presley and Barbara lecturing on categorization
We will read the categorization tutorial the following
week