Transcript Document
Syntax
Why is the structure of language (syntax) important?
How do we represent syntax?
What does an example grammar for English look like?
What strategies exist to find the structure in natural
language?
A Prolog program to recognise English sentences
Pattern matching as an alternative (eg. Eliza)
This uses a database of input output pairs.
The input part of pair is a template to be matched against the user
input
The output part of the pair is given as a response.
X computers Y => Do computers interest you?
X mother Y => Tell me more about your family?
But…
Nothing is known about structure (syntax)
I X you => Why do you X me?
Fine for X = like, but not for X = do not know
Nothing is known about meaning (semantics)
I feel X => I'm sorry you feel X.
Fine for X = depressed, but not for X = happy
Syntax shows the role of words in a sentence.
John hit Sue
vs
Sue hit John
Here knowing the subject allows us to know what is
going on.
Syntax shows how words are related in a sentence.
Visiting aunts ARE boring.
vs
Visiting aunts IS boring.
Subject verb agreement allows us to disambiguate here.
Syntax shows how words are related between sentences.
(a) Italy was beating England. Germany too.
(b) Italy was being beaten by England. Germany too.
Here missing parts of a sentence does not allow us to
understand the second sentence.
But syntax allows us to see what is missing.
But syntax alone is not enough
Visiting museums can be boring
This is not ambiguous for us, as we know there is no such
thing as a "visiting museum", but syntax cannot show this
to a computer.
Compare with…
Visiting aunts can be boring
How do we represent syntax?
Parse Tree
How do we represent syntax?
List
Sue hit John
[ s, [np, [proper_noun, Sue] ] ,
[vp, [v, hit],
[np, [proper_noun, John] ]
What does an example grammar for English look like?
Re-write rules
sentence
2.noun phrase
3.noun phrase
4.verb phrase
5.verb phrase
1.
->
->
->
->
->
noun phrase , verb phrase
art , noun
art , adj , noun
verb
verb , noun phrase
Chomsky Hierarchy
0 Unrestricted
A
1 Context-Sensitive
| LHS | | RHS |
2 Context-Free
|LHS | = 1
3 Regular
|RHS| = 1 or 2 , A a | aB | Ba
What Makes a Good Grammar?
Generality
Selectivity
Understandability
Generality of Grammars
Regular
{abd, ad, bcd, b, abcd, …}
S -> a S1 | b S2 | c S3 | d
S1 -> b S2 | c S3 | d
S2 -> c S3 | d
S3 -> d
Context Free
{anbn}
S -> ab | a S b
Context Sensetive
{ anbncn} or {abcddabcdd, abab, asease, …}
Matching Constituents
* I ate hamburger and on the Stove
* I ate a cold hot dog and well burned
* I ate hot dog slowly and a hamburger
John hitting of Mary alarmed Sue
I cannot explain John hitting of Mary
Sue was alarmed by John hitting of Mary
Testing Constituents
1)
I looked up John’s phone number
I looked up John’s chimney
2)
* I looked up John’s phone number and in the cupboard
I looked up John’s chimney and in his cupboard
* up John’s phone number, I looked
up john’s chimney, I looked
An example:
Parsing sentence:
"They are cooking apples."
Parse 1
Parse 2
What strategies exist for trying to find the
structure in natural language?
Top Down vs. Bottom Up
Bottom - Up
John, hit, the, cat
prpn, hit, the, cat
prpn, v, the, cat
prpn, v, det, cat
prpn, v, det, n
np, v, det, n
np, v, np
np, vp
s
Better if many alternative rules for a
phrase
Worse if many alternative terminal
symbols for each word
Top - Down
s
s -> np, vp
s -> prpn, vp
s -> John, v, np
s -> John, hit, np
s -> John, hit, det,n
s -> John, hit, the,n
s -> John, hit, the,cat
Better if many alternative terminal
symbols for each word
Worse if many alternative rules
for a phrase
Top down parsing
1
The
Step
2
dog 3 cried
Current state
4
1
2
3
((S) 1)
((NP VP) 1)
((ART N VP) 1)
4
((N VP) 2)
5
((VP) 3)
6
((V) 3)
7
Backup States
comment
initial position
Rule 1
Rules 2 & 3
((ART ADJ N VP) 1)
Match Art with the
((ART ADJ N VP) 1)
Match N with dog
((ART ADJ N VP) 1)
Rules 5 & 6
((V NP) 3)
((ART ADJ N VP) 1)
Success
Parsing as a search procedure
1.
Select the first state from the possibilities list
(and remove it from the list).
Generate the new states by trying every possible option
from the selected state
(there may be none if we are on a bad path).
2.
Add the states generated in step 2 to the possibilities list
3.
What strategies exist for trying to find the
structure in natural language?
Depth First vs. Breadth First
Depth First
Try rules one at a time
and back track if you get
stuck
Easier to program
Less memory required
Good if parse tree is deep
Breadth First
Try all rules at the same
time
Can be faster
Order of rules is not
important
Good if tree is flat
An Example of Top-Down Parsing
1 The 2 old 3 man 4 cried 5
Depth First Search versus Breadth First
What strategies exist for trying to find the
structure in natural language?
Left - Right vs. Right – Left
Left - Right
Take words from left to right
Take rule constituents from left to right
Right - Left
Take words from right to left
Take rule constituents from right to left
Left - Right usually best for a language like English
where subject comes before verb ; good for subject - verb
agreement; speech & real time input is L->R; closer to
human processing.
We have trouble with "Have the students given their
assignments by their lecturers" for this reason.
What strategies exist for trying to find the
structure in natural language?
A simple Prolog parser's strategy
Top Down
Prolog tries to satisfy the query "Is this a sentence?" and
works top down in a search for resolution of this query.
Depth first
Prolog's in built search strategy is depth first, so a simple
parser uses this.
L->R
The grammar rules are taken from left to right, and so are
the words of phrases / sentences as Prolog tries to match
them against the grammar rules.
What does a Prolog program look like that tries
to recognise English sentences?
s --> np vp.
np --> det n.
np --> det adj n.
vp --> v np.
What does a Prolog program look like that tries
to recognise English sentences?
sentence(S) :noun_phrase(NP), verb_phrase(VP),
append(NP,VP,S).
noun_phrase(NP) :determiner(D),noun(N),append(D,N,NP).
noun_phrase(NP) :determiner(D),adj(A),noun(N),append(D,A,AP),appen
d(AP,N,NP).
verb_phrase(VP) :verb(V), noun_phrase(NP), append(V,NP,VP).
determiner([D]) :- member(D,[the,a,an]).
noun([N]) :- member(N,[cat,dog,mat,meat,fish]).
adj([A]) :- member(A,[big,fat,red]).
verb([V]) :- member(V,[ate,saw,killed,pushed]).