THE BIG PICTURE

Basic Assumptions
• Linguistics is the empirical science that
studies language (or linguistic behavior)
• Linguistics proposes theories (models) that
can be verified or falsified against linguistic
data
• Computational linguistics is the branch of
linguistics that uses computational models
• Natural language processing (NLP) is the
engineering equivalent of computational
linguistics (as civil engineering is to physics)
The Big Picture
[Diagram: three boxes. Empirical Matter (example judgments: "Maud expects there to be a riot", "*Teri promised there to be a riot", "Maud expects the shit to hit the fan", "*Teri promised the shit to hit the fan"); Formalisms (data structures, formalisms, algorithms, distributional models); Linguistic Theory. The relations between the boxes are marked "?".]
Empirical Matter: Language and/or Linguistic Behavior
• Brain scans
• Grammaticality judgments
o Maud expects there to be a riot
o *Teri promised there to be a riot
o Maud expects the shit to hit the fan
o *Teri promised the shit to hit the fan
• Corpora
• Psycholinguistic experiments
Empirical Matter: Language and/or Linguistic Behavior (?)
[Slide repeated with the same data sources: grammaticality judgments (the Maud/Teri examples), corpora, psycholinguistic experiments, brain scans; a "(?)" is added.]
Underlying Empirical Object of Study
• What is linguistics really “about”?
o The brain (cognitive science)
o Language as an abstract structure (structuralism)
The Big Picture
[Diagram repeated: Empirical Matter (the Maud/Teri judgments, now with an "or" alternative), Formalisms (data structures, formalisms, algorithms, distributional models), and Linguistic Theory. The relations between the boxes are still marked "?".]
Mathematical Formalisms (1)
Data Structures
• Phrase-structure trees
• Dependency trees
• Dags
• …
Formalisms
• CFG
• TAG
• Dependency grammars
• Unification grammars
• …
Algorithms
• Chart parsing
o Bottom-up
o Top-down
o …
• Deterministic parsing
o LR
o …
• Generation
• …
Distributional Models
• Probabilistic CFG
• Probabilistic TAG
• …
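To make the split between data structure, formalism, and algorithm concrete, here is a minimal sketch of bottom-up chart parsing (CKY recognition) over a CFG. The toy grammar, the lexicon, and the name cky_recognize are assumptions made for illustration; they are not taken from the slides.

```python
from itertools import product

# Formalism: a toy CFG in Chomsky normal form (illustrative assumption).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "John": {"NP"},
    "apples": {"NP"},
    "likes": {"V"},
}


def cky_recognize(words, start="S"):
    """Algorithm: bottom-up chart recognition (CKY)."""
    n = len(words)
    if n == 0:
        return False
    # Data structure: chart[i][j] holds the categories that span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    # Build wider spans from pairs of adjacent narrower spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY_RULES.get((left, right), set())
    return start in chart[0][n]
```

Nothing in the recognizer is specific to natural language: replacing BINARY_RULES and LEXICON changes what is recognized, which is the sense in which a formalism does not model language on its own.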
Mathematical Formalisms (2)
• Exist in formal computer science,
mathematics, statistics…
• Exist independently of natural language
• Do not on their own attempt to model or
explain natural language
• Do not on their own succeed in modeling or
explaining natural language
The Big Picture
[Diagram repeated: Empirical Matter (the Maud/Teri judgments, with an "or" alternative), Formalisms (data structures, formalisms, algorithms, distributional models), and Linguistic Theory. Two relations are still marked "?".]
Linguistic Theory
• Phonetics: articulated sounds
• Phonology: how do sounds form minimal meaning units (morphemes)?
• Morphology: how do morphemes form words?
• Syntax: how do words form utterances?
• Semantics: what is the meaning of utterances?
• Pragmatics: in what context do we use which utterance?
Goal of Syntactic Theory (1)
• Goal (version 1): formulate theory of
how words form utterances (sentences,
in written language)
• Goal (version 2): formulate theory of
how words in linear sequence combine
to form utterances
• Utterance represented by non-linear
structure (e.g., a tree)
Goal of Syntactic Theory (2)
• Goal (version 3): formulate theory of how
words in linear sequence correspond to
structures
• Assumption: semantics interprets this
structure as meaning -- in particular,
predicate-argument structure
• Goal (version 4): formulate theory of how
words in linear sequence correspond to
predicate-argument structures
Goal of Syntactic Theory (3)
• What is predicate-argument structure?
John seems to like apples
seem
  like
    John
    apples
• Deep dependency-like structure!
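The predicate-argument structure for "John seems to like apples" can be written down as a small dependency tree. The sketch below assumes, as the slide's diagram suggests, that "seem" takes "like" as its argument and "like" takes "John" and "apples". The names DepNode and to_bracket are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DepNode:
    """A predicate (or argument) with its ordered list of arguments."""
    word: str
    args: list = field(default_factory=list)


# Deep structure assumed for "John seems to like apples".
deep = DepNode("seem", [DepNode("like", [DepNode("John"), DepNode("apples")])])


def to_bracket(node):
    """Render the structure in predicate(argument, ...) notation."""
    if not node.args:
        return node.word
    inner = ", ".join(to_bracket(arg) for arg in node.args)
    return f"{node.word}({inner})"
```

Calling to_bracket(deep) gives seem(like(John, apples)). Note that "John" is an argument of "like" only, even though it is the surface subject of "seems".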
Goal of Syntactic Theory (4)
• So what role does phrase-structure play?
• “Augmented” representation of linear order
[S [NP John] [Vi [V seems] [S [NP t] [Vi [V like] [NP apples]]]]]
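One way to read "augmented representation of linear order" is that the word string can be read off the leaves of the phrase-structure tree from left to right. The sketch below encodes the tree from this slide as nested tuples, keeping the labels as they appear (including "Vi" and the trace "t"; the tree as extracted has no node for "to"). The function names are hypothetical.

```python
# (label, child, child, ...) where a child is a subtree or a word.
TREE = ("S",
        ("NP", "John"),
        ("Vi",
         ("V", "seems"),
         ("S",
          ("NP", "t"),
          ("Vi", ("V", "like"), ("NP", "apples")))))


def leaves(tree):
    """Return the leaves of the tree in left-to-right order."""
    _label, *children = tree
    out = []
    for child in children:
        if isinstance(child, str):
            out.append(child)
        else:
            out.extend(leaves(child))
    return out


def surface_string(tree, trace="t"):
    """Leaves without the unpronounced trace."""
    return [word for word in leaves(tree) if word != trace]
```

The trace marks the position that links "John" to the embedded clause, so the tree carries both the linear order and a hint about the deep dependency.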
Goal of Syntactic Theory (5)
• Goal (version 5, final for us): formulate
theory of how phrase structure of
sentences relates to their deep
dependency
• Goal (version 5 – dependency theories):
formulate theory of how surface
dependency of sentences relates to
their deep dependency
The Big Picture
[Diagram repeated: Empirical Matter (the Maud/Teri judgments, with an "or" alternative), Formalisms (data structures, formalisms, algorithms, distributional models), and Linguistic Theory. The link label "theory of" is added; one relation is still marked "?".]
Components of a Syntactic Theory (1)
• Definition of surface representation
o Choice of data structure/formalism/…
o List of node labels, rules, etc.
• Definition of deep representation
o Choice of data structure/formalism/…
o List of node labels, rules, etc.
• Description of correspondence
o Choice of formal mechanism
o List of rules (?)
Components of a Syntactic Theory (2)
• Formal Framework
o Definition of surface representation
  - Choice of data structure/formalism/…
o Definition of deep representation
  - Choice of data structure/formalism/…
o Description of correspondence
  - Choice of formal mechanism
• Linguistic Content
o Definition of surface representation
  - List of node labels, rules, etc.
o Definition of deep representation
  - List of node labels, rules, etc.
o Description of correspondence
  - List of rules (?)
The Big Picture
[Diagram repeated: Empirical Matter (the Maud/Teri judgments, with an "or" alternative), Formalisms (data structures, formalisms, algorithms, distributional models), and Linguistic Theory. Link labels "uses" and "theory of" are added. The Linguistic Theory box now lists its content: surface representation (e.g., ps), deep representation (e.g., dep), correspondence.]
Note on Competence vs Performance
• Performance: human sentence processing
o Language use
o Interacts with other parts of cognition: memory, emotions, etc.
o Studied in psychology, data from experiments
• Competence: human knowledge of syntax that allows performance
o What we have been and will be talking about in this course, largely
o Studied in linguistics, data from grammaticality judgments and corpora
• Distinction debatable
Content of a Syntactic Theory (1)
• Defeasible predictive theory:
o Have theory
o Needs to be able to make predictions (=deductions)
o Predictions need to be verifiable or falsifiable against empirical matter
o When prediction is falsified, theory needs to be changed
  - Formal framework
  - And/or linguistic content
• “Hypothetico-deductive method” (Popper)
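The predict-and-test loop can be sketched with the grammaticality judgments from the earlier slides. The "theories" below are deliberately crude and hypothetical: each one only lists which main verbs it allows in the frame "NP verb NP to VP". They stand in for a real syntactic analysis and are not the analysis proposed in this course.

```python
# Empirical matter: (sentence, judged grammatical?) pairs from the slides.
JUDGMENTS = [
    ("Maud expects there to be a riot", True),
    ("Teri promised there to be a riot", False),
    ("Maud expects the shit to hit the fan", True),
    ("Teri promised the shit to hit the fan", False),
]


def make_theory(verbs_allowing_frame):
    """Build a toy theory: a sentence is predicted grammatical
    if and only if its main verb (second word) is in the given set."""
    def predict(sentence):
        verb = sentence.split()[1]
        return verb in verbs_allowing_frame
    return predict


def falsifiers(predict, judgments):
    """Return the sentences whose judgment contradicts the prediction."""
    return [s for s, judged_ok in judgments if predict(s) != judged_ok]


# Theory A: both verbs allow the frame. Theory B: only "expects" does.
theory_a = make_theory({"expects", "promised"})
theory_b = make_theory({"expects"})
```

Theory A is falsified by the two starred "promised" sentences, so it has to be changed. Theory B survives this data set, which does not make it true, only not yet falsified.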
Content of a Syntactic Theory (2)
• What exactly is being predicted?
o Set of allowable surface representations:
  - Is predicted sentence in the language?
o AND correspondence between surface representation and deep representation:
  - Is the predicted correspondence plausible?
• What is scope of theory?
o One language: descriptive theory
o All languages: explanatory theory (Chomsky)
Descriptive Theory
• Theory for one language, which is fixed
• Predicts what surface structures (i.e.,
strings) are grammatical
• Predicts, for a given grammatical string
(and its surface representation) its deep
representation
Explanatory Theory (1)
• Need to predict, given a language, what its surface structures and corresponding deep structures are
• Need a parameterized theory
• Chomsky (1981, etc.):
o Principles: things that hold for all languages
o Parameters: values differ for different languages
• “Principles-and-parameters” type theory also used by other researchers (TAG, HPSG, LFG)
Explanatory Theory (2)
Example
Principles
• Formalism
o CFG with slash categories
o Head percolation algorithm
o …
• Linguistic Content
o Set of nonterminal symbols
o Some rules for head percolation algorithm
o …
Parameters
• Formalism
o Number of slashes allowed per nonterminal
o …
• Linguistic Content
o Whether or not VP is used in standard declarative clause
o Rules of CFG
o …
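The principles/parameters split can be pictured as a fixed part shared by all languages plus a small record of per-language settings. The sketch below is only an illustration of that split: the category inventory, the two rule shapes, and the restriction to "/NP" slashes are all assumptions, and the names Parameters, declarative_rule, and slash_categories are hypothetical.

```python
from dataclasses import dataclass

# Principle (fixed for every language in this toy setup):
# the inventory of base nonterminal symbols.
BASE_CATEGORIES = ("S", "NP", "VP", "V")


@dataclass(frozen=True)
class Parameters:
    """Per-language settings."""
    max_slashes: int  # number of slashes allowed per nonterminal
    uses_vp: bool     # whether VP is used in the declarative clause


def declarative_rule(params):
    """Return the (left-hand side, right-hand side) clause rule."""
    if params.uses_vp:
        return ("S", ("NP", "VP"))
    return ("S", ("NP", "V", "NP"))


def slash_categories(params):
    """All categories with at most max_slashes '/NP' suffixes."""
    categories = set(BASE_CATEGORIES)
    frontier = set(BASE_CATEGORIES)
    for _ in range(params.max_slashes):
        frontier = {f"{cat}/NP" for cat in frontier}
        categories |= frontier
    return categories
```

Two languages then differ only in their Parameters value, for example Parameters(max_slashes=1, uses_vp=True) versus Parameters(max_slashes=0, uses_vp=False), while BASE_CATEGORIES and the functions stay the same.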
Linguistic Theories and Empirical Matter
• What is predicted? How can the theory be falsified?
o Behavior of observable data
• What is the theory “about”?
o Descriptive theory: language as structure
o Explanatory theory: presumably, cognition
The Big Picture
Final, for now
[Diagram: Empirical Matter (the Maud/Teri judgments, with an "or" alternative), Formalisms (data structures, formalisms, algorithms, distributional models), and Linguistic Theory (content: surface representation (e.g., ps), deep representation (e.g., dep), correspondence). Link labels: "uses", "predicts", "descriptive theory is about", "explanatory theory is about".]
The Big Picture
Final, for now
[Diagram repeated with the same boxes and link labels ("uses", "predicts", "descriptive theory is about", "explanatory theory is about"); the Linguistic Theory box (content: surface representation (e.g., ps), deep representation (e.g., dep), correspondence) is marked "In rest of course".]