Introduction to Computational Linguistics


Introduction to Linguistics II
Ling 2-121C, group b
Lecture 3
Eleni Miltsakaki
AUTH
Spring 2006
1
Morphology review
• What is the subject matter of morphology?
– The study of the structure of words
• What is a word?
– An arbitrary pairing of sound and meaning
• What is a morpheme?
– Building blocks of complex words
2
Morphology review
• Explain the following distinctions:
– Content words and function words
• Content concepts, open class
• Function grammatical function, closed class
– Bound and free morphemes
• Free: independent words bound: affixes
– Derivational morphology and inflectional morphology
• Derivational: root+bound morpheme=new word with new meaning
• Inflectional: root+bound morpheme= new word with marking of some
grammatical aspect
3
Morphology review
• Word formation
• How are new words created? Give an example of each of the following categories
– Word formation rules (derivations)
– Coining
– Compounding
– Blending
– Acronyms
– Clippings
– Backformation
– Conversion
4
Morphology review
• The hierarchical structure of words
– What’s the evidence?
– How do we represent the hierarchical structure of
words?
• Think of an ambiguous word and represent the
meanings in tree diagrams
5
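One way to make the tree exercise concrete is to write each reading of an ambiguous word as a nested structure. The sketch below is only an illustration and not part of the lecture: the example word "unlockable" and the helper names `bracket` and `morphemes` are assumptions chosen for the example.

```python
# A word tree is either a morpheme (a string) or a tuple of subtrees.
# Reading 1: un + [lock + able] = "not able to be locked"
not_lockable = ("un", ("lock", "able"))
# Reading 2: [un + lock] + able = "able to be unlocked"
can_be_unlocked = (("un", "lock"), "able")


def bracket(tree):
    """Return the bracket notation for a word tree."""
    if isinstance(tree, str):
        return tree
    return "[" + " ".join(bracket(part) for part in tree) + "]"


def morphemes(tree):
    """Return the morphemes in linear order (the leaves of the tree)."""
    if isinstance(tree, str):
        return [tree]
    return [m for part in tree for m in morphemes(part)]


print(bracket(not_lockable))     # [un [lock able]]
print(bracket(can_be_unlocked))  # [[un lock] able]
# Same morphemes in the same order, but two different groupings.
print(morphemes(not_lockable) == morphemes(can_be_unlocked))  # True
```

The two structures share the same linear string, so the difference in meaning can only come from the grouping.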
Syntax
• What is syntax?
– The study of sentence structure
– Video: linear order
6
Grammaticality
• Grammatical sentences are sequences of
words that conform to the rules of syntax.
• Ungrammatical sentences violate syntactic
rules
7
Grammaticality judgment
• Language speakers have intuitions about
grammaticality
– The boy found the ball
– The boy found quickly
– The boy found in the house
– The boy found the ball in the house
8
Grammaticality judgment
• The ability to make grammaticality judgments does NOT
depend on:
– Having heard the sentence before
• Enormous crickets in pink socks danced at the prom
– Whether a sentence is meaningful or not
• Colorless green ideas sleep furiously
– The truth of sentences
• The earth is flat
9
Grammaticality judgment
• Ungrammaticality
– You may understand the meaning of a
sentence and still judge it to be ungrammatical
*The boy quickly in the house the ball found
10
Ambiguity
• Syntax can also account for multiple
meanings: AMBIGUITY
• Like words, sentences have hierarchical
structure
11
Ambiguity
• The girl saw the man with the telescope.
– (The girl) (saw) (the man with the telescope)
– (The girl) (saw) (the man) (with the telescope)
• We can “tree” the ambiguity (will do so
shortly after we look at sentence structure).
12
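The two groupings of the telescope sentence can be compared mechanically by listing the word strings that each structure treats as a unit. This is a minimal sketch under stated assumptions: the category labels (S, NP, VP, PP) and the exact attachment sites are an illustrative analysis, and the helper names `words` and `constituents` are hypothetical.

```python
# A tree is (label, child, child, ...); a leaf is a plain word string.
# Reading 1: the man has the telescope (PP inside the object NP).
np_attachment = (
    "S",
    ("NP", "the", "girl"),
    ("VP", "saw",
        ("NP", "the", "man",
            ("PP", "with", ("NP", "the", "telescope")))),
)
# Reading 2: the girl uses the telescope (PP is a separate part of the VP).
vp_attachment = (
    "S",
    ("NP", "the", "girl"),
    ("VP", "saw",
        ("NP", "the", "man"),
        ("PP", "with", ("NP", "the", "telescope"))),
)


def words(tree):
    """Return the words of a tree in linear order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def constituents(tree):
    """Return the set of word strings that form a phrase in the tree."""
    if isinstance(tree, str):
        return set()
    found = {" ".join(words(tree))}
    for child in tree[1:]:
        found |= constituents(child)
    return found


print(constituents(np_attachment) - constituents(vp_attachment))
# {'the man with the telescope'}
print(constituents(vp_attachment) - constituents(np_attachment))
# {'the man'}
```

Both trees yield the same word order; they differ only in whether "the man with the telescope" is grouped as a single phrase.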
Practice
• Paraphrase to show the ambiguity
– The design has big squares and circles
– Terry loves his wife and so do I
– No smoking section available
– Dick finally decided on the boat
– The sheepdog is too hairy to eat
13
Sentence structure
• Syntactic rules determine the order of
words in a sentence and how the words
are grouped
– The child found the puppy
How many groupings are possible?
14
Tree diagram
15
Tree terminology
• Syntactic trees are upside down
• The root of the tree
• The leaves of the tree
• The nodes of the tree
• Mother-daughter relation
• Siblings: sister-sister relation
• Dominate relation
• Immediately dominate relation
16
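The tree relations listed above can be stated precisely as operations on a small tree data structure. The following is a sketch, not the lecture's own formalization: the `Node` class and its method names are hypothetical, the category labels are assumed, and the words are attached directly under the phrases to keep the example short.

```python
class Node:
    """A tree node that knows its daughters and its mother."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.mother = None
        for child in self.children:
            child.mother = self

    def immediately_dominates(self, other):
        """True if other is a daughter of this node."""
        return other.mother is self

    def dominates(self, other):
        """True if this node is the mother of other, or of one of its ancestors."""
        node = other.mother
        while node is not None:
            if node is self:
                return True
            node = node.mother
        return False

    def is_sister_of(self, other):
        """True if the two distinct nodes share the same mother."""
        return (self is not other
                and self.mother is not None
                and self.mother is other.mother)

    def leaves(self):
        """Return the nodes without daughters, left to right."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


# "The child found the puppy"
puppy = Node("puppy")
subject = Node("NP", [Node("the"), Node("child")])
obj = Node("NP", [Node("the"), puppy])
vp = Node("VP", [Node("found"), obj])
root = Node("S", [subject, vp])  # the root has no mother

print(root.dominates(puppy))              # True
print(root.immediately_dominates(puppy))  # False
print(subject.is_sister_of(vp))           # True
print([leaf.label for leaf in root.leaves()])
# ['the', 'child', 'found', 'the', 'puppy']
```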
Constituents
• The natural groupings of a sentence are
constituents
• Our knowledge of the constituent structure
can be represented with a tree
17
Syntactic categories
• A family of expressions that can substitute for one
another retaining grammaticality is called a syntactic
category
– A police officer found the puppy in the garden
– Your neighbor found the puppy in the garden
– This yellow cat found the puppy in the garden
– They found the puppy in the garden
• What syntactic category is the subject in the above
sentences?
• Can you think of other syntactic categories?
18
Syntactic categories
• S: sentence
• NP: noun phrase
• VP: verb phrase
• PP: prepositional phrase
• AP: adjective phrase
• N: noun, V: verb, P: preposition, A: adjective, D: determiner, Adj: adjective, Adv: adverb, Aux: auxiliary verb
19
Diagnostics for constituents
Diagnostics for phrasal constituents
• Substitution/Pronoun substitution
– Mary loves apples.
– My sister loves everything she sees.
– Black cats detest green beans. They detest them.
• Questions
– What do you love? The cats/Cats with long fluffy tails.
– Where did Ali Baba go? To New York/On a long journey.
• Relocation (movement)
– I fed the cats. The cats, I fed.
• It-cleft focus
– I fed the cats. It was the cats that I fed.
20
Phrase structure trees
• Constituents can be represented graphically as nodes in
a tree
• A tree diagram with syntactic category information is
called a phrase structure tree
• They represent (encode) three aspects of speakers’
syntactic knowledge:
– The linear order of words
– The groupings of words into syntactic categories
– The hierarchical structure of syntactic categories
21
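The three kinds of knowledge a phrase structure tree encodes can be read off a tree written in labelled-bracket notation. The sketch below is an illustration under stated assumptions: the bracketing of "the child found the puppy" is one possible analysis, and `parse`, `leaves` and `phrases` are hypothetical helper names.

```python
def parse(text):
    """Parse labelled brackets like '[NP [D the] [N puppy]]' into (label, children)."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "[", "a phrase must start with '['"
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(read())      # nested phrase
            else:
                children.append(tokens[pos])  # word
                pos += 1
        pos += 1  # consume the closing "]"
        return (label, children)

    return read()


def leaves(tree):
    """Linear order: the words from left to right."""
    _, children = tree
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else leaves(child))
    return out


def phrases(tree, depth=0):
    """Grouping and hierarchy: yield (depth, category, words) for every node."""
    label, children = tree
    yield depth, label, " ".join(leaves(tree))
    for child in children:
        if not isinstance(child, str):
            yield from phrases(child, depth + 1)


tree = parse("[S [NP [D the] [N child]] [VP [V found] [NP [D the] [N puppy]]]]")
print(leaves(tree))  # ['the', 'child', 'found', 'the', 'puppy']
for depth, label, text in phrases(tree):
    print("  " * depth + f"{label}: {text}")
```

The indentation of the printed outline mirrors the hierarchy, the labels give the syntactic categories, and the leaf list gives the linear order.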
Practice
• Draw phrase structure trees for the following
sentences:
– The puppy found the child
– A frightened passenger landed the damaged plane
– The house on the hill collapsed in the wind
– The ice melted
– The children put the toy in the box
– The old tree swayed in the wind
22
Are any strings represented as constituents that shouldn't be?
Are any strings not represented as constituents that should be?
Are any of the trees misleading in other respects?
23
Heads and complements
• Phrase structure rules show relations between the
members of the phrase
• A VP, for example, contains a V which is the head of the
phrase
• The VP may contain other categories but the entire
phrase refers to what the head refers to
– E.g. Put the puppy in the garden refers to the event of ‘putting’
• The other constituents in the phrase are complements
24
Heads and complements
• Every phrasal category has a head of the
same syntactic type:
– VP: V
– NP: N
– PP: P etc.
25
Practice
• Find the head and the complements of the
following NPs
– The man with the telescope
– The destruction of Rome
– A person worthy of praise
– A boy who pitched a perfect game
26
Complement selection
• Whether a verb takes more than one complement depends on the
properties of the verb
• The verb find is a transitive verb and requires an NP direct object
complement
• This information, selection, is included in the lexical entry of the word
and accounts for the grammaticality judgments of the following:
– The boy found the ball
– *The boy found quickly
– *The boy found in the house
27
Complement selection
• Sleep is intransitive: it cannot take an NP
complement
– Michael slept
– *Michael slept a fish
28
Complement selection
• Think takes (selects) a clausal complement. Tell
selects an NP and an S; feel selects an AP
or an S
– I think that Sam won the race
– I told Sam that Michael was on the bicycle
– They felt strong as oxen
– They feel that they can win
– *They feel
29
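The selection facts for find, sleep, think, tell and feel can be sketched as lexical entries that list the complement sequences each verb allows. This is a simplified illustration, not a full lexicon: the frames cover only the uses shown on the slides, and the names `LEXICON` and `is_selected` are hypothetical.

```python
# Each verb maps to the complement sequences it selects (its lexical entry).
LEXICON = {
    "find": [("NP",)],           # transitive: requires an NP direct object
    "sleep": [()],               # intransitive: takes no complement
    "think": [("S",)],           # clausal complement
    "tell": [("NP", "S")],       # an NP and an S
    "feel": [("AP",), ("S",)],   # an AP or an S
}


def is_selected(verb, complements):
    """Return True if the verb's lexical entry allows this complement sequence."""
    return tuple(complements) in LEXICON[verb]


print(is_selected("find", ["NP"]))       # True   The boy found the ball
print(is_selected("find", []))           # False  *The boy found quickly
print(is_selected("find", ["PP"]))       # False  *The boy found in the house
print(is_selected("sleep", ["NP"]))      # False  *Michael slept a fish
print(is_selected("tell", ["NP", "S"]))  # True   I told Sam that Michael was on the bicycle
print(is_selected("feel", []))           # False  *They feel
```

Adverbs such as "quickly" are not complements, which is why "*The boy found quickly" is checked with an empty complement list.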
Complement selection
• It’s not only verbs that have selectional
restrictions
• Belief selects a PP or an S
• Sympathy selects a PP
• Tired selects a PP etc
30
The infinity of language
• aka recursion
• The number of sentences in a language is infinite
• This is because sentences can be lengthened by
various means
• The heart of this linguistic property is the ability
to generate recursive structures
31
The infinity of language
• This is the farmer sowing the corn
– that kept the cock that crowed in the morn,
– that waked the priest all shaven and shorn,
– that married the man all tattered and torn,
– that kissed the maiden all forlorn,
– that milked the cow with the crumpled horn,
– that tossed the dog,
– that worried the cat,
– that killed the rat,
– that ate the malt,
– that lay in the house that Jack built
32
Infinity of language
• The girl with the feather on the ribbon on
the brim
• Tree
33
Infinity of language
• The repetition of categories within categories is
common in all languages and explains the infinity
of language
• Our brain capacity is finite and able to store only
a finite number of categories and rules for their
combination
• These finite means place an infinite set of
sentences at our disposal
34
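The idea that a finite set of rules yields an unbounded set of phrases can be illustrated with two rules that refer to each other: NP → D N (PP) and PP → P NP. The sketch below assumes these two rules and a fixed determiner "the"; the function names are hypothetical and the rules are a simplification, not a grammar of English.

```python
def noun_phrase(nouns, prepositions):
    """Build a tree with NP -> D N (PP) and PP -> P NP.

    nouns supplies one head noun per NP; prepositions links each NP to the next.
    """
    head, rest = nouns[0], nouns[1:]
    if not rest:
        return ("NP", "the", head)  # base case: no embedded PP
    inner = noun_phrase(rest, prepositions[1:])  # an NP inside a PP inside an NP
    return ("NP", "the", head, ("PP", prepositions[0], inner))


def words(tree):
    """Return the words of a tree in linear order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def count(tree, label):
    """Count the nodes in the tree that carry the given category label."""
    if isinstance(tree, str):
        return 0
    return int(tree[0] == label) + sum(count(child, label) for child in tree[1:])


np = noun_phrase(["girl", "feather", "ribbon", "brim"], ["with", "on", "on"])
print(" ".join(words(np)))  # the girl with the feather on the ribbon on the brim
print(count(np, "NP"))      # 4
print(count(np, "PP"))      # 3
```

Adding one more noun and preposition to the input produces a longer phrase with the same two rules, so there is no longest noun phrase.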