powerpoint file - Intelligent Software Lab.


Statistical Methods and Linguistics - Steven Abney
1998. 09. 24. Thur.
POSTECH Computer Science
NLP Lab 9425021
Shim Jun-Hyuk
Contents

Introduction

Linguistics Review under Statistical Methods
  Language Acquisition
  Language Change
  Language Variation

Language Structure and Performance
  Language Property
  Grammaticality and Ambiguity v. Performance
  Non-Linguistic Factors for Performance
  Grammaticality and Acceptability
  Grammar and Computation
  The Frictionless Plane, Autonomy and Isolation
  Holy Grail
CS730B - Statistical NLP
Contents

How Statistics Helps
  Disambiguation
  Degrees of Grammaticality
  Naturalness
  Structural Preferences
  Error Tolerance
  Learning on the Fly
  Lexical Acquisition

Objections
  Are Stochastic Methods Only for Engineers?
  Did Not Chomsky Debunk All This Ages Ago?

Conclusion
Introduction

Linguistics
  Computational Linguistics
    Performance
    Practical application
    Little concerned with human language processing
    Rationale provided by statistical methods
  Theoretical Linguistics
    Competence
    Theoretical research on grammars and structures
    Concerned with human language processing

Objectives
  Theoretical background of statistical analyses
  Review from the viewpoint of linguistics
  Importance of weighted grammars
1. Linguistics Review under Statistical Models (1)

Objective
  Examine linguistic issues in terms of populations of grammars
  Populations of grammars can be usefully examined with statistical models

Language Acquisition (LA)
  Probabilistic (stochastic) or weighted grammars in children’s LA process
  Co-existence and decay in grammars
  Algebraic (non-stochastic) grammar as a supplement
1. Linguistics Review under Statistical Models (2)

Language Change
  Change in the probability of language constructions
    EX) Rules, parameter settings
  Not “abrupt”, but “gradual”
  Statistical co-existence and decay
    “Adult monolingual speaker” - in the end, the grammar is stochastic across the community

Language Variation
  Dialectology
    Arbitrary continuum of language created by geographic distance
    Contact frequency and intelligibility
  Typology
    EX) Language features, conditional probability distributions
    Statistical modeling using stochastic grammars
2. Language Structure and Performance (1)

Language
  Algebraic Properties
    Idealization - adult monolingual speaker
    Theoretical syntax - linguistic data
    Structure judgments for competence
  Statistical Properties
    Stochastic model - performance data
    Adjustments to structure-judgment data for “performance effects”
    Grammaticality and ambiguity judgments about sentences, as opposed to structures
2. Language Structure and Performance (2)

Grammaticality and Ambiguity v. Performance
  Example
    The a are of I
    The cows are grazing in the meadow
    John saw Mary
  The ambiguity problem under grammatical structures
  Genuine ambiguities v. spurious ambiguities
    Spurious analyses are not ungrammatical, but undesired
    Case 1 - elided sentence
    Case 2 - rare usage
    The problem is how to identify the correct structure from among the possible ones
    This can be addressed with the weighted grammars used in computational linguistics
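
The weighted-grammar idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy grammar, all rule weights, and the function name `analyses_for` are invented here and do not come from the talk. The sketch runs a Viterbi-style CKY pass over "John saw Mary" and compares the ordinary sentence reading with a spurious noun-compound reading in which "saw" is a noun.

```python
from collections import defaultdict

# Hypothetical toy grammar; every weight below is made up for illustration.
LEXICAL = {  # (category, word) -> weight
    ("NP", "John"): 0.4,
    ("NP", "Mary"): 0.4,
    ("NP", "saw"): 0.1,   # rare usage: "saw" as a noun
    ("V", "saw"): 1.0,
}
BINARY = {  # (parent, left child, right child) -> weight
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
    ("NP", "NP", "NP"): 0.1,  # noun compound, the source of spurious parses
}
ROOT = {"S": 0.9, "NP": 0.1}  # assumed weight of each category as a whole utterance


def analyses_for(words):
    """Return {root category: (weight, best tree)} for the word list."""
    n = len(words)
    chart = defaultdict(dict)  # (i, j) -> {category: (weight, tree)}
    for i, word in enumerate(words):
        for (cat, lex), weight in LEXICAL.items():
            if lex == word:
                chart[i, i + 1][cat] = (weight, (cat, word))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, left, right), weight in BINARY.items():
                    if left in chart[i, k] and right in chart[k, j]:
                        wl, tl = chart[i, k][left]
                        wr, tr = chart[k, j][right]
                        cand = weight * wl * wr
                        # Keep only the highest-weighted tree per category.
                        if cand > chart[i, j].get(parent, (0.0, None))[0]:
                            chart[i, j][parent] = (cand, (parent, tl, tr))
    return {cat: (ROOT[cat] * w, tree)
            for cat, (w, tree) in chart[0, n].items() if cat in ROOT}


if __name__ == "__main__":
    results = analyses_for("John saw Mary".split())
    for cat, (weight, tree) in sorted(results.items(), key=lambda kv: -kv[1][0]):
        print(cat, weight, tree)
```

Both analyses are licensed by the grammar, so neither is ungrammatical; the weights alone push the noun-compound reading far below the sentence reading.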
2. Language Structure and Performance (3)

Non-Linguistic Factors for Performance
  Perception is a matter of performance; it depends on non-linguistic factors as well as grammaticality

Grammaticality and Acceptability
  Perceptions of grammaticality and ambiguity are performance data
  What is “performance data”? Finding some choice of words and context that yields a clear positive judgment (acceptability)

Grammar and Computation
  The problem: how can we compute the linguistic data simply and absolutely?
    Competence v. computation
  Autonomy of syntax - not the same as isolation, and not reducible to semantics

Holy Grail
  The larger picture and ultimate goal of generative linguistics is to make sense of language production, comprehension, acquisition, variation, and change
3. How Statistics Helps (1)

Disambiguation
  Describing an algorithm that computes the correct parse among the possible parses
  Correct parse - the parse that humans perceive
  Various statistical methods exist
  EX) “John walks” - context-free grammar with weighted rules

Degrees of Grammaticality
  Gradations of acceptability
  Degrees of error in speech production
  The measure of goodness is a global measure that combines degrees of grammaticality with naturalness and structural preference
  Through parameter estimation, we can obtain a measure of “degrees of grammaticality”
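
As a rough illustration of how parameter estimation could yield a graded score, the sketch below estimates rule weights by relative frequency and sums their logarithms over a derivation. The rule counts, the rule inventory, and the names `estimate_weights` and `log_score` are all hypothetical; the slides do not specify an estimation method, and relative frequency is simply one common choice.

```python
import math
from collections import Counter

# Hypothetical counts of how often each rule was used in some annotated data.
RULE_COUNTS = Counter({
    ("S", ("NP", "VP")): 95,
    ("S", ("VP",)): 5,        # subjectless sentence, rare in this made-up data
    ("NP", ("Det", "N")): 70,
    ("NP", ("N",)): 30,
    ("VP", ("V", "NP")): 60,
    ("VP", ("V",)): 40,
})


def estimate_weights(counts):
    """Relative-frequency estimate: count(rule) / count(all rules with same left side)."""
    totals = Counter()
    for (lhs, _rhs), count in counts.items():
        totals[lhs] += count
    return {rule: count / totals[rule[0]] for rule, count in counts.items()}


def log_score(derivation, weights):
    """Sum of log weights of the rules used; higher means 'more typical'."""
    return sum(math.log(weights[rule]) for rule in derivation)


if __name__ == "__main__":
    weights = estimate_weights(RULE_COUNTS)
    common = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V",))]
    rare = [("S", ("VP",)), ("VP", ("V",))]
    print(log_score(common, weights), log_score(rare, weights))
```

The score is continuous rather than a yes/no verdict, which is the sense in which a weighted grammar can express gradations of acceptability.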
3. How Statistics Helps (2)

Naturalness
  Plausibility - in the sense of selectional preferences
  Collocational knowledge - “how do you say it”
  Statistical methods are applied to collocations and selectional restrictions

Structural Preference
  One of the parsing strategies
  Longest-match preference
  Plays an important role in the dispreference for certain structures

Error Tolerance
  Detecting errors in sentences and selecting the best analysis
  A primary motivation for Shannon’s noisy channel model
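
The noisy channel idea can be sketched as choosing the intended form that maximizes P(intended) x P(observed | intended). The example below works at the word level for brevity; the candidate words, both probability tables, and the function name `decode` are invented for illustration and are not taken from the talk.

```python
# Hypothetical prior over intended words (a stand-in for a language model).
PRIOR = {"the": 0.05, "ten": 0.001, "tea": 0.0005}

# Hypothetical channel model: P(observed | intended), e.g. from typing errors.
CHANNEL = {
    ("teh", "the"): 0.01,
    ("teh", "ten"): 0.002,
    ("teh", "tea"): 0.002,
}


def decode(observed):
    """Return the most probable intended word and the score of every candidate."""
    scores = {
        intended: PRIOR[intended] * CHANNEL.get((observed, intended), 0.0)
        for intended in PRIOR
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = decode("teh")
    print(best, scores)
```

The same decomposition applies to whole sentences: an erroneous input is still assigned a best analysis, because the channel term allows for distortion instead of rejecting the input outright.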
3. How Statistics Helps (3)

Learning on the Fly
  Much like error correction
  Admits a space of learning operations
    Assigning a new part of speech to a word
    Adding a new subcategorization frame to a verb, etc.

Lexical Acquisition
  The absolute richness of natural language grammars and lexica
  Primary area of application for distributional and statistical approaches to acquisition
  Examples of distributional approaches
    Acquisition of parts of speech
    Collocations
    Selectional restrictions, etc.
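
One simple distributional measure for collocations is pointwise mutual information (PMI) between adjacent words. The sketch below is an assumption-laden illustration, not the method of the talk: the five-sentence corpus is made up, and `collect_counts` and `pmi` are hypothetical helper names.

```python
import math
from collections import Counter

# Made-up miniature corpus; real collocation work needs far more data.
CORPUS = [
    "she drinks strong tea",
    "he drinks strong tea",
    "she bought a powerful computer",
    "he bought a powerful computer",
    "a strong wind",
]


def collect_counts(sentences):
    """Count single words and adjacent word pairs within each sentence."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pmi(x, y, unigrams, bigrams):
    """log2 of P(x, y) / (P(x) * P(y)); -inf if the pair was never observed."""
    joint = bigrams[(x, y)]
    if joint == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_xy = joint / n_bi
    return math.log2(p_xy / ((unigrams[x] / n_uni) * (unigrams[y] / n_uni)))


if __name__ == "__main__":
    uni, bi = collect_counts(CORPUS)
    for pair in [("strong", "tea"), ("a", "strong"), ("powerful", "tea")]:
        print(pair, pmi(*pair, uni, bi))
```

In this toy data "strong tea" scores higher than "a strong", and "powerful tea" is never observed, which mirrors the "how do you say it" flavor of collocational knowledge.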
4. Objections to Statistical Methods

Are Stochastic Models Only for Engineers?
  Are stochastic models always merely a practical stopgap approximation?
  Given a complex deterministic system and its initial conditions, we can in principle compute the state at all times
  In fact, stochastic models often give more insight and success than identifying every deterministic factor

What Did Chomsky Really Prove?
  Syntactic Structures (1957)
    Chomsky: $\mathrm{grammatical}(s) \iff P_n(s) > \varepsilon$
      • no choice of $n$ and $\varepsilon$ works
      • $P_n(s)$: probability of $s$ under the best $n$-th order approximation to English
    Shannon’s Markov model (MM): $\mathrm{grammatical}(s) \iff \lim_{n \to \infty} P_n(s) > \varepsilon$
      • as $n$ increases, the non-zero probability erroneously assigned to ungrammatical strings decreases
  Handbook of Mathematical Psychology (1963)
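
The effect of the order n can be illustrated with a small maximum-likelihood n-gram model. Everything here is an assumption made for the sketch: the four-sentence corpus, the padding symbols, and the names `train` and `prob`. A bigram model gives non-zero probability to a string with a subject-verb agreement error, while a 5-gram model trained on the same data gives it zero.

```python
from collections import Counter

# Made-up corpus with a subject-verb agreement dependency across three words.
CORPUS = [
    "the dog near the cat barks",
    "the dog near the cats barks",
    "the dogs near the cat bark",
    "the dogs near the cats bark",
]


def train(sentences, n):
    """Count n-grams and their (n-1)-word histories, with start/end padding."""
    grams, histories = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            history = tuple(tokens[i - n + 1:i])
            grams[history + (tokens[i],)] += 1
            histories[history] += 1
    return grams, histories


def prob(sentence, model, n):
    """Maximum-likelihood probability of the sentence under the n-gram model."""
    grams, histories = model
    tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
    p = 1.0
    for i in range(n - 1, len(tokens)):
        history = tuple(tokens[i - n + 1:i])
        if histories[history] == 0:
            return 0.0  # unseen history: no estimate, treat as zero
        p *= grams[history + (tokens[i],)] / histories[history]
    return p


if __name__ == "__main__":
    bad = "the dog near the cat bark"  # agreement error
    for n in (2, 5):
        print(n, prob(bad, train(CORPUS, n), n))
```

The unsmoothed estimate also assigns zero to any grammatical sentence that happens to be missing from the training data, so this only illustrates the trend with n, not a workable grammaticality test.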
5. Conclusion

Statistical Methods
  Weighted grammars, distributional induction methods
  Relevant to linguistics

Performance v. Competence
  Performance is not a goal but a useful tool of computational linguistics
  Competence is needed to understand the algebraic properties of language
  Algebraic methods are inadequate for understanding human language

The Age of Computational Linguistics using Statistical Technology