Intro to the module - School of Computer Science, University of
Introduction to the Module
John Barnden
School of Computer Science
University of Birmingham
Natural Language Processing 1
2015/16 Semester 2
I/Me/Mine
• John Barnden is my name
• And natural language processing is my game ...
– Specifically and mainly: metaphor theory & processing
• I’m Professor of Artificial Intelligence
I’m also Diversity & Equality officer for the School.
• Coords:
– Room 136
– Tel. 4-3816
– [email protected]
Demonstrator
• Mohab Elkaref
[email protected]
room 218
• His job:
– Help you with any aspect of the module
– Incl.: understanding the material, getting a start on exercises (even when
assessed), using some computer programs that will be available, helping
with marking, giving a lecture on his own work.
• NB: I’m also getting three other PhD students to give lectures on
their work and the technology they use.
You
• What degrees are you on?
• Why did you choose this module?
• What have you heard about NLP?
Syllabus Page and Website
• FIND and READ the syllabus page for this module!!
• In the Relevant Links section, follow the link to my own top
webpage for the module.
– Mainly, the Canvas page will just point to that page and include the
recordings of the lectures.
• READ that top webpage.
• Lecture slides, exercises, etc. will probably hang from it,
not directly from the Canvas page.
• I will have slides up a day or two before a lecture, but probably
not more, as I like to allow lots of flexibility in class.
Assessment
• 1.5 hour exam (80%).
– NB: in its detail, will differ considerably from previous exams.
• Mid-term test (10%), on the Thursday in week 5 of this term.
– i.e. Thurs 11th Feb.
• Exercise-set as homework, Weeks 9-11 (10%)
– To be done individually, with limited collaboration (to be clarified later).
– Be aware of the plagiarism documentation in the student handbook on
the School website!!
Official Aims of Module
(plus Notes by me)
• Introduce Natural Language Processing as one of the
components of Artificial Intelligence, both from engineering and
cognitive viewpoints. Note:
– NLP gives insight into mind, and into AI in general.
• Provide a basis for the programming of NLP techniques ….
Notes:
– The module is not a software workshop, and only aims to give you abstract
algorithms and other background for NLP programming.
– Emphasis will be more on the underlying concepts, theory, problems, and
understanding of algorithms.
– But you will also be introduced to some practical tools.
More Notes on Aims of Module
• The module will largely be about processing of textual
language.
– Only occasional comments will be made about processing of speech.
– The language-processing field is largely divided into textual and speech-processing aspects.
– Speech brings in a host of extra technical problems.
– Text processing is (more than) enough for (more than) one module!
• The main module textbook contains much information about speech
processing (optional reading).
• The module will (very briefly) mention ramifications into sign
language and manual gesture.
• There will be some attention to variations such as textese.
Unofficial Aims of Module
• Make you aware of language as a really fun thing to think
about!
• To show you it acts strangely and wonderfully all around us all
the time!
• To show you it’s technically challenging to deal with, in all sorts
of fascinating ways!
Textbook and Its Relationship to Module
• Main textbook is the Jurafsky & Martin 2009 book on syllabus page.
• Plays an important role in the module.
• In many cases the lectures can only give a brief intro to a more detailed
treatment in the textbook.
• Assessed work will assume a (reasonable level of) knowledge of specified
parts of the textbook.
• Lectures will cover some things not covered in the textbook, and will further
illuminate some things that are.
• You can of course ask me or the demonstrator privately for help with
understanding textbook material.
Nature of Class Sessions
• Mainly lecture, but with
– Occasional in-class exercises (formative)
– Mid-term in-class test (assessed -- 10%).
• You are strongly encouraged to ask questions or make comments
in class.
• I will have detailed lecture slides (accessible via my module
website), but may say important things that are not on the slides.
• These slides will always be on the web.
• I will occasionally supply additional notes (electronic), including
answer notes about exercises.
What the Study of Language Covers, 1
(NB: not all covered in this module!)
• What language is, as distinct from other things we do or use.
• But also how it’s related to some such things.
• Whether other creatures use language.
• Speech aspects, textual aspects, signing aspects, gestural aspects.
• Connection of language to diagrams, pictures, music, thought ...
• Poetic and other artistic aspects of language.
• Specific purposes of language such as persuasion and intimacy-building.
• Learning/teaching of language (either naturally or deliberately).
• Development of language over history.
What the Study of Language Covers, 2
• How we get meaning (in the broadest sense, including things like
emotion) from discourse.
• How discourse is broken down into components (e.g., sentences,
phrases, words, parts of words).
• How the meaning of a phrase, sentence or complex discourse
segment depends on the meanings of the parts and other
information.
• How the above differs between: text, speech, signing, ...
• Translation between different languages.
Language Technology
• Any use of language processing by a computer system. Some main
topical examples, all of extensive, current practical importance:
– Machine translation.
– Document summarization.
– Information extraction.
– Text mining.
– Information retrieval (usually = retrieval of whole documents).
– Conversational agents, whether for
• general chat as in fronting of sites (IKEA, US Army, ...), chatrooms and artificial companions
• or for specific tasks such as booking tickets, therapy, or other life help.
– Sentiment analysis: extracting the emotional/evaluative tone of language
objects such as product reviews, customer complaints or user interactions
with an HCI system.
– Web searching.
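To make one of these applications concrete, here is a minimal sketch of sentiment analysis by word counting. It is only an illustration: the word lists POSITIVE and NEGATIVE and the function name sentiment_score are hypothetical, invented for this example, and practical systems rely on much larger lexicons or on learned models.

```python
import re

# Hypothetical mini-lexicons, assumed for illustration only.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    # Lower-case the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives

print(sentiment_score("I love this great phone"))
print(sentiment_score("Terrible battery, poor screen"))
print(sentiment_score("not good"))
```

The third call scores "not good" as positive, because counting words ignores negation. That is a small instance of why meaning cannot simply be read off individual words.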
A Standard Breakdown
• Language is traditionally (and still currently) viewed as having the
following aspects or levels:
– Phonological / orthographical (and the analogous level in sign language):
The patterns of sounds, letters or hand/body movements in basic units such as words,
and what happens to them when words (etc.) are put together
– Morphological:
Largely about how words are broken down into conceptually significant segments (i.e.
not just into letters, etc.)
– Syntactic:
The patterns of words of various types found in bigger units such as sentences.
– Semantic:
The primary meanings of words, phrases and sentences.
– Pragmatic:
More subtle and/or context-dependent aspects of the way in which meaning and
other effects arise from language. Often extends beyond sentence boundaries.
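As a rough illustration of the morphological level, the sketch below splits a word into a stem plus at most one suffix. The suffix list and the function name segment are assumptions made up for this example; it is a toy, not a real morphological analyser.

```python
# Hypothetical, hand-picked suffix list; longer suffixes are tried first.
SUFFIXES = ["ness", "ing", "ed", "ly", "s"]

def segment(word):
    """Split a word into [stem, suffix], or return [word] if no suffix applies."""
    for suffix in SUFFIXES:
        # Require a stem of at least three letters so short words stay whole.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return [word[:-len(suffix)], suffix]
    return [word]

for w in ["walked", "kindness", "cats", "sing", "glass"]:
    print(w, segment(w))
```

The sketch handles "walked" and "kindness" sensibly and leaves "sing" whole, but it wrongly splits "glass" into "glas" + "s". Pure letter patterns are not enough: segments have to be conceptually significant.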
But This Breakdown is Broken-Down!
• There is no sharp distinction between morphology and syntax.
– For one thing, what counts as a word is unclear. And words can be built from other
words. The nature of the distinction varies between languages.
• The syntax/semantics distinction is somewhat difficult and theory-laden.
– Even defining what the traditional “parts of speech” (nouns, verbs, etc.) are in an
objective way is tricky, and brings in both syntax and semantics.
• The semantics/pragmatics distinction is hugely contentious and theory-laden.
– There are many different versions of what sort of meaning semantics gets at, and of
what pragmatics adds.
• Even if the breakdown could be theoretically maintained,
it would not imply that language processing would, should, or even could, be
correspondingly divided,
because of the extensive interaction between the different aspects.
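The point that "what counts as a word is unclear" can be seen by comparing two simple ways of splitting text. Both functions below (whitespace_tokens and regex_tokens) are hypothetical helpers written for this illustration, and neither is claimed to be the right definition of a word.

```python
import re

def whitespace_tokens(text):
    """Treat anything separated by whitespace as a word."""
    return text.split()

def regex_tokens(text):
    """Treat runs of letters/digits as words and each punctuation mark as its own token."""
    return re.findall(r"\w+|[^\w\s]", text)

sentence = "Don't buy ice-cream in New York."
print(whitespace_tokens(sentence))
print(regex_tokens(sentence))
```

The first function finds 6 tokens and keeps "York." with its full stop attached; the second finds 11 and breaks "Don't" and "ice-cream" into three pieces each. Neither treats "New York" as a single unit.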
Rough Set of Topics
• What counts as a word?
• (Morphology)
• Simple Grammar and Parts of Speech (POSs)
• POS Analysis
• Syntactic Analysis
• Some Logic needed for ...
• Semantic Analysis
• Pragmatics and Other Advanced Topics
Some Intriguing Exercises
You do “Introductory Exercise-Set A.”
If there’s time, we discuss those exercises.
You do “Introductory Exercise-Set B.”
That will lead into the next segment of the module ...