Introduction to probabilistic
models of cognition
Josh Tenenbaum
MIT
Why probabilistic models of cognition?
The fundamental problem of
cognition
How does the mind get so much out of so little?
How do we make inferences, generalizations,
models, theories and decisions about the world
from impoverished (sparse, incomplete, noisy)
data?
“The problem of induction”
Visual perception
(Marr)
Ambiguity in visual perception
• Goal of visual
perception is to recover
world structure from
visual images.
• Why the problem is
hard: many world
structures can produce
the same visual input.
• Illusions reveal the
visual system’s implicit
knowledge of the
physical world and the
processes of image
formation.
(Shepard)
Learning concepts from examples
“horse”
“horse”
“horse”
Learning concepts from examples
“tufa”
“tufa”
“tufa”
Causal inference
             cold < 1 week    cold > 1 week
drug               5                1
no drug            2                6
Does this drug help you
get over a cold faster?
Don’t press
this button!
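The drug question can be worked through numerically. The sketch below assumes the first column of the table counts people whose cold lasted under a week and the second counts those whose cold lasted longer; the uniform Beta(1, 1) priors and the function names are illustrative choices, not part of the slide.

```python
import random

def recovery_rate(fast, slow):
    """Fraction of a group that got over the cold within a week."""
    return fast / (fast + slow)

def prob_drug_better(drug, no_drug, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_drug > rate_no_drug | data).

    Each argument is a (fast, slow) pair of counts. Assumes independent
    uniform Beta(1, 1) priors on the two recovery rates (an assumption),
    so each posterior is Beta(fast + 1, slow + 1).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_drug = rng.betavariate(drug[0] + 1, drug[1] + 1)
        p_none = rng.betavariate(no_drug[0] + 1, no_drug[1] + 1)
        wins += p_drug > p_none
    return wins / draws

drug_counts = (5, 1)      # counts read from the table: drug row
no_drug_counts = (2, 6)   # counts read from the table: no-drug row

print("fast-recovery rate with drug:   ", recovery_rate(*drug_counts))
print("fast-recovery rate without drug:", recovery_rate(*no_drug_counts))
print("P(drug rate is higher | data):  ", prob_drug_better(drug_counts, no_drug_counts))
```

With only 14 people, the raw rates alone say little; the posterior comparison shows how strongly such sparse data favor the drug under the stated priors.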
Language
• Parsing:
– Two cars were reported stolen by the Groveton police
yesterday.
– The judge sentenced the killer to die in the electric chair
for the second time.
– No one was injured in the blast, which was attributed to
a buildup of gas by one town official.
– One witness told the commissioners that she had seen
sexual intercourse taking place between two parked
cars in front of her house.
(Pinker)
Language
• Parsing
• Acquisition:
– Learning the English past tense (rule vs.
exceptions)
– Learning the Spanish or Arabic past tense
(multiple rules plus exceptions)
– Learning verb argument structure (“give” vs.
“donate”)
– Learning to be bilingual.
Intuitive theories
• Physics
– Parsing: Inferring support relations, or the causal
history and properties of an object.
– Acquisition: Learning about gravity and support.
• Gravity -- what’s that?
• Contact is sufficient
• Mass distribution and location are important
• Psychology
– Parsing: Inferring beliefs, desires, plans.
– Acquisition: Learning about agents.
• Recognizing intentionality, but without mental state reasoning
• Reasoning about beliefs and desires
• Reasoning about plans, rationality and “other minds”.
The big questions
1. How does knowledge guide inductive learning,
inference, and decision-making from sparse, noisy or
ambiguous data?
2. What are the forms and contents of our knowledge of
the world?
3. How is that knowledge itself learned from experience?
4. When faced with surprising data, when do we
assimilate the data to our current model versus
accommodate our model to the new data?
5. How can accurate inductive inferences be made
efficiently, even in the presence of complex
hypothesis spaces?
A toolkit for answering these questions
1. Bayesian inference in probabilistic generative models
2. Probabilities defined over structured representations:
graphs, grammars, predicate logic, schemas
3. Hierarchical probabilistic models, with inference at all
levels of abstraction
4. Adaptive nonparametric or “infinite” models, which
can grow in complexity or change form in response to
the observed data.
5. Approximate methods of learning and inference, such
as belief propagation, expectation-maximization (EM),
Markov chain Monte Carlo (MCMC), and sequential
Monte Carlo (particle filtering).
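As one small illustration of the approximate methods in item 5, here is a minimal Metropolis sampler (a basic form of MCMC) for the posterior over a coin's bias. The coin example, the uniform prior, the proposal width, and all names are assumptions chosen for illustration; this is a sketch, not a method taken from the lectures.

```python
import math
import random

def log_posterior(p, heads, tails):
    """Unnormalized log posterior for coin bias p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return float("-inf")
    return heads * math.log(p) + tails * math.log(1.0 - p)

def metropolis(heads, tails, steps=20000, burn_in=2000, step_sd=0.2, seed=0):
    """Random-walk Metropolis: propose a nearby p, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    p = 0.5  # arbitrary starting point inside (0, 1)
    samples = []
    for i in range(steps):
        proposal = p + rng.gauss(0.0, step_sd)
        log_ratio = log_posterior(proposal, heads, tails) - log_posterior(p, heads, tails)
        # Proposals outside (0, 1) have log_ratio = -inf and are always rejected.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            p = proposal
        if i >= burn_in:
            samples.append(p)
    return samples

samples = metropolis(heads=7, tails=3)
print("posterior mean estimate:", sum(samples) / len(samples))
```

For this toy problem the exact posterior is Beta(8, 4), so the sampled mean can be checked against 8/12; the sampler matters in models where no such closed form exists.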
Grammar G
S → NP VP
NP → Det [ Adj ] Noun [ RelClause ]
RelClause → [ Rel ] NP V
VP → VP NP
VP → Verb
P(S | G)
Phrase structure S
P(U | S)
Utterance U
P(S | U, G) ∝ P(U | S) × P(S | G)
Bottom-up: P(U | S)
Top-down: P(S | G)
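The combination of bottom-up and top-down evidence can be sketched in a few lines. The two candidate parses and every probability below are hypothetical values chosen only to show the arithmetic of P(S | U, G) ∝ P(U | S) × P(S | G).

```python
def posterior(prior, likelihood):
    """Combine P(S | G) and P(U | S) over candidate structures S and normalize."""
    unnormalized = {s: likelihood[s] * prior[s] for s in prior}
    total = sum(unnormalized.values())
    return {s: value / total for s, value in unnormalized.items()}

# Hypothetical numbers: the grammar favors parse_A (top-down),
# while the utterance fits parse_B somewhat better (bottom-up).
prior = {"parse_A": 0.8, "parse_B": 0.2}       # P(S | G)
likelihood = {"parse_A": 0.3, "parse_B": 0.6}  # P(U | S)

for parse, prob in posterior(prior, likelihood).items():
    print(parse, round(prob, 3))
```

Here the top-down prior outweighs the bottom-up evidence, so parse_A ends up with about two thirds of the posterior mass even though parse_B explains the utterance better.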
“Universal Grammar”
Hierarchical phrase structure
grammars (e.g., CFG, HPSG, TAG)
P(grammar | UG)
Grammar
P(phrase structure | grammar)
Phrase structure
P(utterance | phrase structure)
Utterance
P(speech | utterance)
Speech signal
S → NP VP
NP → Det [ Adj ] Noun [ RelClause ]
RelClause → [ Rel ] NP V
VP → VP NP
VP → Verb
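The step from grammar to phrase structure to utterance can be sketched as a sampler for the rules above. Only the rule shapes come from the slide; the rule probabilities, the tiny lexicon, and the reading of "V" in the RelClause rule as a verb are assumptions made for illustration.

```python
import random

# Hypothetical probabilities for the optional and recursive parts of the rules.
P_ADJ = 0.3         # NP includes the optional Adj
P_RELCLAUSE = 0.2   # NP includes the optional RelClause
P_REL = 0.5         # RelClause includes the optional Rel
P_VP_RECURSE = 0.3  # choose VP -> VP NP rather than VP -> Verb

# Hypothetical lexicon; the slide does not list any words.
LEXICON = {
    "Det": ["the", "a"],
    "Adj": ["old", "red"],
    "Noun": ["dog", "judge", "car"],
    "Verb": ["saw", "chased", "slept"],
    "Rel": ["that"],
}

def word(category, rng):
    return rng.choice(LEXICON[category])

def sample_np(rng):
    """NP -> Det [ Adj ] Noun [ RelClause ]"""
    words = [word("Det", rng)]
    if rng.random() < P_ADJ:
        words.append(word("Adj", rng))
    words.append(word("Noun", rng))
    if rng.random() < P_RELCLAUSE:
        words.extend(sample_relclause(rng))
    return words

def sample_relclause(rng):
    """RelClause -> [ Rel ] NP V"""
    words = [word("Rel", rng)] if rng.random() < P_REL else []
    return words + sample_np(rng) + [word("Verb", rng)]

def sample_vp(rng):
    """VP -> VP NP | Verb"""
    if rng.random() < P_VP_RECURSE:
        return sample_vp(rng) + sample_np(rng)
    return [word("Verb", rng)]

def sample_sentence(rng):
    """S -> NP VP"""
    return sample_np(rng) + sample_vp(rng)

rng = random.Random(0)
for _ in range(3):
    print(" ".join(sample_sentence(rng)))
```

Running the generative model forward like this is the top-down direction; parsing inverts it by asking which sampled structures could have produced an observed utterance.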
Vision as probabilistic parsing
(Han and Zhu, 2006)
Learning word meanings
Principles
Structure
Data
Whole-object principle
Shape bias
Taxonomic principle
Contrast principle
Basic-level bias
Causal learning and reasoning
Principles
Structure
Data
Goal-directed action
(production and comprehension)
(Wolpert et al., 2003)
Why probabilistic models of cognition?
• A framework for understanding how the mind can solve
fundamental problems of induction.
• Strong, principled quantitative models of human cognition.
• Tools for studying people’s implicit knowledge of the world.
• Beyond classic limiting dichotomies: “structure vs. statistics”,
“nature vs. nurture”, “domain-general vs. domain-specific”.
• A unifying mathematical language for all of the cognitive
sciences: AI, machine learning and statistics, psychology,
neuroscience, philosophy, linguistics…. A bridge between
engineering and “reverse-engineering”.
Why now? Much recent progress, in computational resources,
theoretical tools, and interdisciplinary connections.
Summer school plan
• Weekly plan
– Week 1: Basic probabilistic models. Applications to
visual perception, categorization, causal learning.
– Week 2: More advanced probabilistic models
(grammars, logic, MDPs). Applications to reasoning,
language, scene understanding, decision-making,
neuroscience.
– Week 3: Further applications to memory, motor
control, sensory integration, unsupervised learning and
cognitive development. Symposia on open challenges
and student research.
Summer school plan
• Daily plan
– 5 (or 6) lectures per day.
– Starting Wednesday, break-out sessions after lunch, for
discussion with speakers.
– Evening tutorials:
Matlab, Probability basics, Bayes net toolbox (for matlab),
SamIam, BUGS, Markov logic networks and Alchemy.
– Psych computer lab (available afternoons).
– Self-organizing activities:
Sign up for calendar on 30boxes.com:
Email address: [email protected]
Password: “ipam07”
Background poll
• Bayes’ rule
• Conjugate prior
• Bayesian network
• Plate notation for graphical models
• Mixture model
• Hidden Markov model
• Expectation-maximization (EM) algorithm
• Dynamic programming
• Gaussian processes
• Dirichlet processes
• First-order logic
• (Stochastic) context-free grammar
• Probabilistic relational models
• MCMC
• Particle filtering
• Partially observable Markov decision process
• Serotonin
Poll for tonight
• Matlab tutorial?
• Probability basics?