Computational Linguistics - University of Maryland Institute for


Computational Linguistics:
Computers and the Brain
University of Maryland
Yakov Kronrod
Dan Parker
Irene Eleta
Raul David Guerra
Judith Klavans
Different Strokes
Both Computers and the Human Brain
• store representations
• combine previously acquired information
• retrieve past “knowledge”
• can “see”, “hear”, “speak”?
Computers
• digital
• process specific instructions
• programmable
• infallible (as far as programming goes)
• fast complex computations
• very bad at reasoning
• largely serial processing
Human Brain
• organic
• can modify behavior on the fly
• can learn well
• prone to mistakes
• slow at complex computations
• very good at reasoning
• largely parallel processing
What’s the connection?
• What do computers have to do with
Language?
• Holy Grail: Full Language Understanding
• NLP vs. CL
Types of Computational Linguistics and
Natural Language Processing
• Speech Recognition/Comprehension
• Speech Production
• Language Induction
• Lexical Analysis
• Semantic Analysis
• Machine Translation
• Language Modeling
• Cognitive Processing Modeling
Computational Linguistics: Defined
• Computational linguistics is an interdisciplinary field
dealing with the statistical and/or rule-based modeling of
natural language from a computational perspective.
Computational linguists often work as members of
interdisciplinary teams, including linguists (specifically
trained in linguistics), language experts (persons with some
level of ability in the languages relevant to a given project),
and computer scientists. In general, computational
linguistics draws upon the involvement of linguists,
computer scientists, experts in artificial intelligence,
mathematicians, logicians, philosophers, cognitive
scientists, cognitive psychologists, psycholinguists,
anthropologists and neuroscientists, among others.
Computational Linguistics: Defined
• Natural language processing (NLP) is a field of
computer science and linguistics concerned with
the interactions between computers and human
(natural) languages. In theory, natural-language
processing is a very attractive method of human-computer interaction. NLP has significant overlap
with the field of computational linguistics, and is
often considered a sub-field of artificial
intelligence. Research into modern statistical NLP
algorithms requires an understanding of a
number of disparate fields, including linguistics,
computer science, and statistics.
Building Blocks: Words
• Commonly seen as the basic building block of
language
(Letters? Sounds? Noises?)
• Can express basic thoughts
(e.g., give, please, that, etc.)
• Used to construct sentences
• Used to convey associations and emotions
• Placeholders for concepts and images
Words in the Brain
• “Tip of the tongue” phenomena
• So many to keep track of
• How do we remember a word we want?
• Needed to process a sentence
Digitizing the Word
• Words can be used as “tokens” to compute
many different things
• Words can be used to keep track of
associations and descriptions
• Search and Learning using words in computers
1 picture; 1000 words
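As a minimal sketch of what treating words as “tokens” can look like, the snippet below splits a sentence into lowercase words and counts them. The splitting rule and the sample sentence are illustrative assumptions, not something taken from the slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


# Hypothetical sample sentence, used only for illustration.
sample = "Please give that to the dog. The dog wants that."
counts = word_counts(sample)
print(counts.most_common(3))
```

Counts like these are one common starting point for search and for learning associations between words.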
But First…
…A few words from our resident Linguist.
Words in the Mind
Where are words stored?
The Mental LEXICON
The Mental Lexicon:
• Holds all of the words you know.
• How many words are in our lexicon?
~60,000 words!
The lexicon is not just a bag of words…
But it’s also not a dictionary…
Because you know so many words, the brain needs to find words VERY quickly.
How might a word be accessed?
Target word: Cane
Words in the lexicon: Cane, Cap, Cookie, Dog, Candle
Input so far, one letter at a time: C___, CA__, CAN_, CANE
Image from
http://www.psych.nyu.edu/pylkkanen/Neural_Bases/07_slides/11_LexAccess_Elect.pdf
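One simple way to model this letter-by-letter access is to keep only the words that still match the input so far. The sketch below assumes that kind of prefix narrowing (in the spirit of cohort-style models); it is an illustration, not necessarily what the slide's figure depicts. The function names are hypothetical; the word list is the one shown on the slide.

```python
# Word list from the slide, lowercased for matching.
LEXICON = ["cane", "cap", "cookie", "dog", "candle"]


def candidates(prefix, lexicon=LEXICON):
    """Return the words that begin with the input received so far."""
    return [word for word in lexicon if word.startswith(prefix.lower())]


def access(word, lexicon=LEXICON):
    """Feed the word in one letter at a time and record the surviving candidates."""
    steps = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        steps.append((prefix, candidates(prefix, lexicon)))
    return steps


for prefix, remaining in access("cane"):
    # Pad with underscores to mirror the C___, CA__, CAN_, CANE display.
    print(prefix.upper().ljust(4, "_"), remaining)
```

Under this assumption, each new letter shrinks the candidate set until only “cane” is left.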
Emotion:
positive: Celebrate, birthday, Party, Cheer, joy
negative: death, Sorrow, remorse
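A small sketch of how a computer might keep track of such word-emotion associations: a lookup table from word to label, plus a score that counts positive and negative words in a text. The assignment of each slide word to “positive” or “negative” and the scoring rule are assumptions made for illustration.

```python
import re

# Assumed labels for the words on the slide (not stated explicitly in the source).
EMOTION_LEXICON = {
    "celebrate": "positive",
    "birthday": "positive",
    "party": "positive",
    "cheer": "positive",
    "joy": "positive",
    "death": "negative",
    "sorrow": "negative",
    "remorse": "negative",
}


def emotion_score(text, lexicon=EMOTION_LEXICON):
    """Return (#positive words - #negative words) found in the text."""
    score = 0
    for token in re.findall(r"[a-z]+", text.lower()):
        label = lexicon.get(token)
        if label == "positive":
            score += 1
        elif label == "negative":
            score -= 1
    return score


print(emotion_score("A birthday party full of joy"))
print(emotion_score("Death and sorrow"))
```

Words missing from the table contribute nothing to the score, which is one obvious limit of so small a lexicon.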
Title: Portrait of a Family Playing Music
Title: The Studio
Title: Girl at the Piano: recording sound
Tags: abstract, cello, musicians, violin, music, colorful, piano, geometric
An image is worth a thousand words…
…in all languages
(or how people from different cultures build the many meanings of an image)
US: Two women, Sewing, Light and shadow, Doves, Crown of thorns, Melancholy
Spain: La Virgen (the Virgin), Cosiendo (sewing), Cristo (Christ), Corona de espinas (crown of thorns), Luz y sombra (light and shadow), tranquilidad (tranquility)
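To compare the two tag sets computationally, one option is to map the Spanish tags into English and measure the overlap. The sketch below assumes a small hand-made glossary and uses Jaccard similarity as the overlap measure; both are illustrative choices, not a method described in the slides.

```python
US_TAGS = {"two women", "sewing", "light and shadow", "doves",
           "crown of thorns", "melancholy"}
SPAIN_TAGS = {"la virgen", "cosiendo", "cristo", "corona de espinas",
              "luz y sombra", "tranquilidad"}

# Hypothetical hand-made glossary for this one image.
ES_TO_EN = {
    "la virgen": "the virgin",
    "cosiendo": "sewing",
    "cristo": "christ",
    "corona de espinas": "crown of thorns",
    "luz y sombra": "light and shadow",
    "tranquilidad": "tranquility",
}


def translate(tags, glossary):
    """Map each tag through the glossary, leaving unknown tags unchanged."""
    return {glossary.get(tag, tag) for tag in tags}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b)


spain_in_english = translate(SPAIN_TAGS, ES_TO_EN)
shared = US_TAGS & spain_in_english
print(sorted(shared))
print(round(jaccard(US_TAGS, spain_in_english), 2))
```

The tags that do not overlap (for example “doves” versus “Cristo”) are the ones that point to culture-specific readings of the image.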
Tags: Cotton picking, two girls, African American, slavery, Racism, Southern, The South, Civil war
Museum User: I remember a boat …
Museum Expert: Boat Builders, set on the Gloucester, Massachusetts coast,
relates to a series of prints and drawings devoted to
shipbuilding. Homer also used this image of the two boys
in an engraving for the October 11, 1873 issue of Harper’s
Weekly. ………..
Looking Forward
• Advances in Bridging the Linguistics – NLP gap
• Directions of research for language processing
• Better models for human language usage
• Cross-pollination of ideas across disciplines
Thank You!
• Questions
• Comments
• Discussion