Texture, Contours and Regions: Cue Integration in Image

Towards Human Level AI
Jitendra Malik
U.C. Berkeley
This talk
• Approaches to computational intelligence
• Review of human intelligence and child
development
• Review of computer vision and its problems
• The problem of visual categories
• Proposed agenda
Paradigms for mechanizing
intelligence ~1960
• Classic AI (McCarthy, Minsky, Newell, Simon)
– Games, theorem-proving, reasoning
– Search, represent and reason in first-order logic
• Pattern Recognition/Neural Networks (Rosenblatt)
– Classification, Associative memory
– Learning (Perceptrons …)
• Estimation and Control (Kalman)
– Decide action in uncertain, time-varying environment
– Markov Decision Processes, adaptive control …
Achievements (1960-1990)
• Classic AI (McCarthy, Minsky, Newell, Simon)
– Chess programs, planning systems, first generation
expert systems
– [[ relational databases ]]
• Pattern Recognition/Neural Networks (Rosenblatt)
– Various applications of MLPs and other learning
techniques
– HMMs for speech recognition
• Estimation and Control (Kalman)
– Man on Moon
– Self-driving cars …
John von Neumann’s warning
• "As a mathematical discipline travels far from its empirical source, or still more, if it is a
second and third generation only indirectly inspired from ideas coming from 'reality', it
is beset with very grave dangers. It becomes more and more purely aestheticizing, more
and more purely l'art pour l'art. This need not be bad, if the field is surrounded by
correlated subjects, which still have closer empirical connections, or if the discipline is
under the influence of men with an exceptionally well-developed taste.
But there is a grave danger that the subject will develop along the line of least resistance,
that the stream, so far from its source, will separate into a multitude of insignificant
branches, and that the discipline will become a disorganized mass of details and
complexities.
In other words, at a great distance from its empirical source, or after much 'abstract'
inbreeding, a mathematical subject is in danger of degeneration. At the inception the
style is usually classical; when it shows signs of becoming baroque the danger signal is
up. It would be easy to give examples, to trace specific evolutions into the baroque and
the very high baroque, but this would be too technical.
In any event, whenever this stage is reached, the only remedy seems to me to be the
rejuvenating return to the source: the reinjection of more or less directly empirical ideas.
I am convinced that this is a necessary condition to conserve the freshness and the
vitality of the subject, and that this will remain so in the future."
Brain Sub-Systems
• Sensory
– Vision (30-50%)
– Audition
– Somatic
– Chemical (Taste, Smell)
• Motor
– Manipulation
– Locomotion
– Speech
• Language
• Central
– Planning and problem solving
– …
What do we know from human
child development?
• It is nature AND nurture
Visual Development
• Axon growth guided by chemical gradients
(in turn due to gene expression)
• Critical period for development of
orientation selectivity (Hubel & Wiesel)
• New-born babies sensitive to faces
• Visual tracking ~3 mo
• Binocularity/Stereopsis ~4 mo
Language Development
• Babbling & tuning phonemes
• Developing link between words and objects
• Words refer to objects
• They denote categories
• Objects have only one name
• Vocabulary growth: slow between 12 and 18 mo (median no. of words
at 20 mo is 169), very rapid afterwards (13k words at 6 yrs)
• Word pairs (18-24 mo)
• Grammar takes off after that
Cognitive development
• Categorization
• Perception/reality distinction
• Self-Awareness (mirror test)
Many types of memory
• Semantic memory
• Episodic memory
• Skill memory
The Hilbert Problems of Computer
Vision
Jitendra Malik
Forty years of computer vision
1963-2003
• 1960s: Beginnings in artificial intelligence, image
processing and pattern recognition
• 1970s: Foundational work on image formation: Horn,
Koenderink, Longuet-Higgins …
• 1980s: Vision as applied mathematics: geometry, multiscale analysis, control theory, optimization …
• 1990s:
– Geometric analysis largely completed
– Probabilistic/Learning approaches in full swing
– Successful applications in graphics, biometrics, HCI …
And now …
• Back to basics: the classic problem of
understanding the scene from its image/s
• Central question: Interplay of bottom-up
and top-down information
Early Vision
• What can we learn from image statistics that
we didn't know already?
• How far can bottom-up image segmentation
go?
• How do we make inferences from shading
and texture patterns in natural images?
Static Scene Understanding
• What is the interaction between
segmentation and recognition?
• What is the interaction between scenes,
objects, and parts?
• What is the role of design vs. learning in
recognition systems?
Dynamic Scene Understanding
• What is the role of high-level knowledge in
long range motion correspondence?
• How do we find and track articulated
structures?
• How do we represent "movemes" and
actions?
Proposed Research Agenda
Child Language Acquisition with visual input